
Transformer-based Autoencoder with ID Constraint for Unsupervised Anomalous Sound Detection

Published 13 Oct 2023 in cs.SD and eess.AS | (2310.08950v1)

Abstract: Unsupervised anomalous sound detection (ASD) aims to detect unknown anomalous sounds of devices when only normal sound data is available. Autoencoder (AE) based and self-supervised learning based methods are the two mainstream approaches. However, AE-based methods can be limited because the features learned from normal sounds can also fit anomalous sounds, which reduces the model's ability to detect anomalies. Self-supervised methods are not always stable and can perform differently even for machines of the same type. In addition, an anomalous sound may be short-lived, making it even harder to distinguish from normal sound. This paper proposes an ID-constrained Transformer-based autoencoder (IDC-TransAE) architecture with weighted anomaly score computation for unsupervised ASD. The machine ID is used to constrain the latent space of the Transformer-based autoencoder (TransAE): a simple ID classifier is introduced to learn the differences in distribution among machines of the same type, which enhances the model's ability to distinguish anomalous sounds. Moreover, weighted anomaly score computation is introduced to highlight the anomaly scores of anomalous events that appear only for a short time. Experiments on the DCASE 2020 Challenge Task 2 development dataset demonstrate the effectiveness and superiority of the proposed method.
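The abstract does not give the weighting formula, so the following is only an illustrative sketch of why a weighted score can help with short-lived anomalies. It assumes per-frame reconstruction errors as input and uses a rank-based weighting (larger errors receive larger weights); the function names and the `decay` parameter are hypothetical and are not taken from the paper.

```python
def mean_score(frame_errors):
    """Plain average of per-frame reconstruction errors."""
    return sum(frame_errors) / len(frame_errors)


def weighted_score(frame_errors, decay=0.9):
    """Rank-weighted average of per-frame errors (assumed weighting scheme).

    Errors are sorted in descending order and the i-th largest error gets
    weight decay**i. `decay` is a hypothetical hyperparameter in (0, 1]:
    decay=1 reduces to the plain mean, and values near 0 approach the max.
    """
    ranked = sorted(frame_errors, reverse=True)
    weights = [decay ** i for i in range(len(ranked))]
    return sum(w * e for w, e in zip(weights, ranked)) / sum(weights)


# Toy clip: 98 low-error frames plus a 2-frame burst of high error.
errors = [0.1] * 98 + [5.0, 5.0]
print("mean score:    ", mean_score(errors))
print("weighted score:", weighted_score(errors, decay=0.5))
```

In this toy clip the plain mean is dominated by the many low-error frames, while the rank-weighted score stays close to the burst, which is the behavior the abstract attributes to weighted anomaly score computation. Both functions assume a non-empty list of errors.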

References (49)
  1. Koizumi, Y., Saito, S., Uematsu, H., Kawachi, Y., Harada, N.: Unsupervised detection of anomalous sound based on deep learning and the neyman-pearson lemma. IEEE/ACM Transactions on Audio, Speech, and Language Processing 27(1), 212–224 (2018) Chalapathy and Chawla [2019] Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: A survey. arXiv preprint arXiv:1901.03407 (2019) Nunes [2021] Nunes, E.C.: Anomalous sound detection with machine learning: A systematic review. arXiv preprint arXiv:2102.07820 (2021) Guan et al. [2023] Guan, J., Liu, Y., Zhu, Q., Zheng, T., Han, J., Wang, W.: Time-weighted frequency domain audio representation with GMM estimator for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Foggia et al. [2015] Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. 
Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. 
In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. 
[2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: A survey. arXiv preprint arXiv:1901.03407 (2019) Nunes [2021] Nunes, E.C.: Anomalous sound detection with machine learning: A systematic review. arXiv preprint arXiv:2102.07820 (2021) Guan et al. [2023] Guan, J., Liu, Y., Zhu, Q., Zheng, T., Han, J., Wang, W.: Time-weighted frequency domain audio representation with GMM estimator for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Foggia et al. [2015] Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. 
[2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Nunes, E.C.: Anomalous sound detection with machine learning: A systematic review. arXiv preprint arXiv:2102.07820 (2021) Guan et al. [2023] Guan, J., Liu, Y., Zhu, Q., Zheng, T., Han, J., Wang, W.: Time-weighted frequency domain audio representation with GMM estimator for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Foggia et al. [2015] Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. 
[2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Liu, Y., Zhu, Q., Zheng, T., Han, J., Wang, W.: Time-weighted frequency domain audio representation with GMM estimator for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Foggia et al. [2015] Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. 
IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. 
[2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). 
IEEE
Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018)
Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018)
Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020)
Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021)
Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022)
Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. arXiv preprint arXiv:2305.07828 (2023)
Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016)
Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015)
Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE
Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE
Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE
Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE
Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019)
Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021)
Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020)
Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020)
Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021)
Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020)
Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021)
Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020)
Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster AdaCos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021)
Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022)
Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE
Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE
Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023)
Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022)
Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022)
Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE
Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013)
Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: NICE: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014)
Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018)
Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017)
Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer
Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE
Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR
Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press (2012)
Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection.
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019)
Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE
Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. 
Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. 
[2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Glorot et al.
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. 
arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
  2. Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: A survey. arXiv preprint arXiv:1901.03407 (2019)
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. 
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. 
[2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). 
IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. 
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 
1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. 
[2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). 
IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. 
[2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
  3. Nunes, E.C.: Anomalous sound detection with machine learning: A systematic review. arXiv preprint arXiv:2102.07820 (2021)
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 
81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 
1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
  4. Guan, J., Liu, Y., Zhu, Q., Zheng, T., Han, J., Wang, W.: Time-weighted frequency domain audio representation with GMM estimator for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015) Li et al. [2018] Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 
1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. 
[2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. 
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. 
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011).
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. 
[2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). 
IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Glorot et al.
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 
315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  5. Foggia, P., Petkov, N., Saggese, A., Strisciuglio, N., Vento, M.: Audio surveillance of roads: A system for detecting anomalous sounds. IEEE Transactions on Intelligent Transportation Systems 17(1), 279–288 (2015)
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018) Chung et al. [2013] Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. 
[2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013) Henze et al. [2019] Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. 
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE Oh and Yun [2018] Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. 
[2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. 
[2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 
8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 
315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
  6. Li, Y., Li, X., Zhang, Y., Liu, M., Wang, W.: Anomalous sound detection using deep audio representation and a BLSTM network for audio surveillance of roads. IEEE Access 6, 58043–58055 (2018)
IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. 
Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. 
In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. 
[2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 
1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. 
[2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). 
IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. 
[2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  7. Chung, Y., Oh, S., Lee, J., Park, D., Chang, H.-H., Kim, S.: Automatic detection and recognition of pig wasting diseases using sound data in audio surveillance systems. Sensors 13(10), 12929–12942 (2013)
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018) Park and Yun [2018] Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 
1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. 
[2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. 
In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). 
Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  8. Henze, D., Gorishti, K., Bruegge, B., Simen, J.-P.: AudioForesight: A process model for audio predictive maintenance in industrial environments. In: Proceedings of International Conference On Machine Learning And Applications (ICMLA), pp. 352–357 (2019). IEEE
[2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. 
[2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. 
[2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. 
arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
  9. Oh, D.Y., Yun, I.D.: Residual error based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308–1321 (2018)
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018) Koizumi et al. [2020] Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. [2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. 
[2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020) Kawaguchi et al. 
[2021] Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). 
IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021) Dohi et al. [2022] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Dohi et al. [2023] Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. 
[2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. 
In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). 
Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  10. Park, Y., Yun, I.D.: Fast adaptive RNN encoder–decoder for anomaly detection in SMD assembly machine. Sensors 18(10), 3573–3583 (2018)
  11. Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020)
  12. Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE2021 challenge task 2: Unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021)
  13. Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022)
[2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. [2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. 
[2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 
315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
  11. Koizumi, Y., Kawaguchi, Y., Imoto, K., Nakamura, T., Nikaido, Y., Tanabe, R., Purohit, H., Suefusa, K., Endo, T., Yasuda, M., Harada, N.: Description and discussion on DCASE2020 challenge task2: Unsupervised anomalous sound detection for machine condition monitoring. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Tokyo, Japan, pp. 81–85 (2020)
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. 
[2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. 
[2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. 
[2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. 
In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  12. Kawaguchi, Y., Imoto, K., Koizumi, Y., Harada, N., Niizumi, D., Dohi, K., Tanabe, R., Purohit, H., Endo, T.: Description and discussion on DCASE 2021 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 186–190 (2021)
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. 
[2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. 
In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  13. Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Yamamoto, M., Kawaguchi, Y.: Description and discussion on DCASE2022 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022)
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv preprint arXiv: 2305.07828 (2023) Zabihi et al. 
[2016] Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. 
[2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. 
In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. 
[2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. 
In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  14. Dohi, K., Imoto, K., Harada, N., Niizumi, D., Koizumi, Y., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Description and discussion on DCASE 2023 challenge task 2: First-shot unsupervised anomalous sound detection for machine condition monitoring. arXiv preprint arXiv:2305.07828 (2023)
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. 
[2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. 
In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. 
[2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. 
[2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. 
In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  15. Zabihi, M., Rad, A.B., Kiranyaz, S., Gabbouj, M., Katsaggelos, A.K.: Heart sound anomaly and quality detection using ensemble of neural networks without segmentation. In: Proceedings of Computing in Cardiology Conference (CinC), pp. 613–616 (2016) Tagawa et al. [2015] Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. [2021] Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE Kim et al. [2019] Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. 
[2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015) Marchi et al. [2015a] Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE Marchi et al. [2015b] Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE Suefusa et al. [2020] Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE Wichern et al. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 
315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
  16. Tagawa, T., Tadokoro, Y., Yairi, T.: Structured denoising autoencoder for fault detection and analysis. In: Proceedings of Asian Conference on Machine Learning (ACML), pp. 96–111 (2015)
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. 
[2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. 
[2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. 
[2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. 
In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. 
  17. Marchi, E., Vesperini, F., Eyben, F., Squartini, S., Schuller, B.: A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1996–2000 (2015). IEEE
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. 
[2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. 
[2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. 
In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Glorot et al.
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. 
In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
  18. Marchi, E., Vesperini, F., Weninger, F., Eyben, F., Squartini, S., Schuller, B.: Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–7 (2015). IEEE
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 
315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  19. Suefusa, K., Nishida, T., Purohit, H., Tanabe, R., Endo, T., Kawaguchi, Y.: Anomalous sound detection based on interpolation deep neural network. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275 (2020). IEEE
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019) Van Truong et al. [2021] Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 
8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021) Giri et al. [2020a] Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020) Giri et al. [2020b] Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020) Zavrtanik et al. [2021] Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 
315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  20. Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J.: Anomalous sound detection using attentive neural processes. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 186–190 (2021). IEEE
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. 
In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). 
Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  21. Kim, H., Mnih, A., Schwarz, J., Garnelo, M., Eslami, A., Rosenbaum, D., Vinyals, O., Teh, Y.W.: Attentive neural processes. arXiv preprint arXiv:1901.05761 (2019)
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021) Kapka [2020] Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 
55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  22. Van Truong, H., Hieu, N.C., Giao, P.N., Phong, N.X.: Unsupervised detection of anomalous sound for machine condition monitoring using fully connected U-Net. Journal of ICT Research & Applications 15(1), 41–55 (2021)
[2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 
336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. 
Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). 
IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. 
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  23. Giri, R., Cheng, F., Helwani, K., Tenneti, S.V., Isik, U., Krishnaswamy, A.: Group masked autoencoder based density estimator for audio anomaly detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 51–55 (2020)
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. 
In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. 
[2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
  24. Giri, R., Tenneti, S.V., Helwani, K., Cheng, F., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation. Technical report, DCASE2020 Challenge (2020)
[2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020) Kuroyanagi et al. [2021] Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. 
[2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021) Giri et al. [2020] Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. 
In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). 
Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020) Wilkinghoff [2021] Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). 
IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wilkinghoff, K.: Combining multiple distributions based on sub-cluster adacos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021) Venkatesh et al. [2022] Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. 
[2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  25. Zavrtanik, V., Kristan, M., Skočaj, D.: DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 8330–8339 (2021)
arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
  26. Kapka, S.: ID-conditioned auto-encoder for unsupervised anomaly detection. arXiv preprint arXiv:2007.05314 (2020)
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022) Liu et al. [2022] Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
  27. Kuroyanagi, I., Hayashi, T., Adachi, Y., Yoshimura, T., Takeda, K., Toda, T.: An ensemble approach to anomalous sound detection based on Conformer-based autoencoder and binary classifier incorporated with metric learning. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 110–114 (2021)
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE Guan et al. [2023] Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE Hejing et al. [2023] Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. 
[2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. 
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
  28. Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Self-supervised classification for detecting anomalous sounds. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 46–50 (2020)
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023) Xiao et al. [2022] Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. 
[2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. 
In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
  29. Wilkinghoff, K.: Combining multiple distributions based on sub-cluster AdaCos for anomalous sound detection under domain shifted conditions. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Barcelona, Spain, pp. 55–59 (2021)
arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. 
In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. 
In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  30. Venkatesh, S., Wichern, G., Subramanian, A., Le Roux, J.: Improved domain generalization via disentangled multi-task learning in unsupervised anomalous sound detection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Nancy, France (2022)
[2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022) Wei et al. [2022] Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. 
Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022) Dohi et al. [2021] Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE Tabak and Turner [2013] Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013) Dinh et al. [2014] Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. 
arXiv preprint arXiv:1410.8516 (2014) Kingma and Dhariwal [2018] Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018) Papamakarios et al. [2017] Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. 
[2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. 
In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. 
[2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). 
  31. Liu, Y., Guan, J., Zhu, Q., Wang, W.: Anomalous sound detection using spectral-temporal information fusion. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820 (2022). IEEE
209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 
In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. 
[2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  32. Guan, J., Xiao, F., Liu, Y., Zhu, Q., Wang, W.: Anomalous sound detection using audio representation with machine ID based contrastive learning pretraining. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  33. Hejing, Z., Jian, G., Qiaoxi, Z., Feiyang, X., Youde, L.: Anomalous sound detection using self-attention-based frequency pattern analysis of machine sounds. In: Proceedings of INTERSPEECH, pp. 336–340 (2023)
In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  34. Xiao, F., Liu, Y., Wei, Y., Guan, J., Zhu, Q., Zheng, T., Han, J.: The DCASE2022 challenge task 2 system: Anomalous sound detection with self-supervised attribute classification and GMM-based clustering. Technical report, DCASE2022 Challenge (2022)
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  35. Wei, Y., Guan, J., Lan, H., Wang, W.: Anomalous sound detection system with self-challenge and metric evaluation for DCASE2022 challenge task 2. Technical report, DCASE2022 Challenge (2022)
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  36. Dohi, K., Endo, T., Purohit, H., Tanabe, R., Kawaguchi, Y.: Flow-based self-supervised density estimation for anomalous sound detection. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 336–340 (2021). IEEE
  37. Tabak, E.G., Turner, C.V.: A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics 66(2), 145–164 (2013)
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017) Kolesnikov and Lampert [2016] Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. 
[2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer Koizumi et al. [2017] Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE Glorot et al. [2011] Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR Murphy [2012] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. 
MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT press (2012) Purohit et al. [2019] Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019) Koizumi et al. [2019] Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE Perez-Castanos et al. [2020] Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. 
arXiv preprint arXiv:2006.15321 (2020) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  38. Dinh, L., Krueger, D., Bengio, Y.: Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014)
  39. Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1x1 convolutions. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2018)
  40. Papamakarios, G., Pavlakou, T., Murray, I.: Masked autoregressive flow for density estimation. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017)
  41. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems (NIPS) (2017)
  42. Kolesnikov, A., Lampert, C.H.: Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 695–711 (2016). Springer
  43. Koizumi, Y., Saito, S., Uematsu, H., Harada, N.: Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma. In: Proceedings of European Signal Processing Conference (EUSIPCO), pp. 698–702 (2017). IEEE
  44. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 315–323 (2011). PMLR
  45. Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press (2012)
  46. Purohit, H., Tanabe, R., Ichige, T., Endo, T., Nikaido, Y., Suefusa, K., Kawaguchi, Y.: MIMII Dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. In: Proceedings of Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, pp. 209–213 (2019)
  47. Koizumi, Y., Saito, S., Uematsu, H., Harada, N., Imoto, K.: ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. In: Proceedings of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 313–317 (2019). IEEE
  48. Perez-Castanos, S., Naranjo-Alcazar, J., Zuccarello, P., Cobos, M.: Anomalous sound detection using unsupervised and semi-supervised autoencoders and gammatone audio representation. arXiv preprint arXiv:2006.15321 (2020)
  49. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)