Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition
Abstract: Recognizing underwater audio plays a significant role in identifying vessels while they are in motion. Underwater audio target recognition (UATR) has a wide range of applications, including marine environmental protection, detection of ship-radiated noise, underwater noise control, and coastal vessel dispatch. The traditional UATR task involves training a network to extract features from audio data and predict the vessel type. Existing UATR datasets fall short in both duration and sample quantity. In this paper, we propose Oceanship, a large-scale and diverse underwater audio dataset. The dataset comprises 15 categories, spans a total duration of 121 hours, and includes comprehensive annotations such as coordinates, velocity, vessel type, and timestamps. We compiled the dataset by crawling and organizing original communication data from the Ocean Networks Canada (ONC) database between 2021 and 2022. While audio retrieval is well established in general audio classification, it has not been explored for underwater audio recognition. Leveraging the Oceanship dataset, we introduce a baseline model named Oceannet for underwater audio retrieval. The model achieves a recall at 1 (R@1) of 67.11% and a recall at 5 (R@5) of 99.13% on the Deepship dataset.
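The retrieval metrics quoted in the abstract, R@1 and R@5, can be illustrated with a minimal sketch. It assumes a common definition of recall at k: a query counts as a hit if at least one of its k most similar gallery items shares its label. The abstract does not specify the evaluation protocol, so the cosine-similarity ranking, the function names, and the toy embeddings and labels below are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recall_at_k(queries, query_labels, gallery, gallery_labels, k):
    """Fraction of queries with a same-label item among their top-k matches."""
    hits = 0
    for q, q_label in zip(queries, query_labels):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(range(len(gallery)),
                        key=lambda i: cosine(q, gallery[i]),
                        reverse=True)
        if any(gallery_labels[i] == q_label for i in ranked[:k]):
            hits += 1
    return hits / len(queries)

# Toy 2-D "embeddings" (hypothetical; real audio embeddings are much larger).
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
gallery_labels = ["cargo", "tug", "tug", "cargo"]
queries = [[0.9, 0.1], [0.8, 0.6]]
query_labels = ["cargo", "cargo"]

r_at_1 = recall_at_k(queries, query_labels, gallery, gallery_labels, k=1)
r_at_2 = recall_at_k(queries, query_labels, gallery, gallery_labels, k=2)
print(r_at_1, r_at_2)
```

In this toy case the second query's nearest neighbour has the wrong label but its second-nearest has the right one, so recall rises as k grows. The gap between the reported R@1 and R@5 reflects the same effect.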