Xin Shen, Guoliang Yuan, Huibing Wang, Xianping Fu
Unsupervised clustering optimization-based efficient attention in YOLO for underwater object detection
Artificial Intelligence Review, volume 58, issue 7. Published 2025-04-23. Journal Article.
DOI: 10.1007/s10462-025-11218-6
Article: https://link.springer.com/article/10.1007/s10462-025-11218-6
PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11218-6.pdf
Impact factor: 10.7. JCR: Q1 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Underwater object detection is a prerequisite for underwater robots to carry out ocean exploration and autonomous grasping. However, underwater detection tasks face unavoidable interference factors such as poor imaging quality, highly random environments, and well-concealed organisms. These factors cause strong background interference and weak object perception, which greatly increases the difficulty of underwater object detection. To address these problems, we propose an unsupervised clustering optimization-based efficient attention (UCOEA). In contrast to the channel-wise, cross-channel, and channel grouping strategies, we design a channel clustering strategy that screens channel information autonomously and dynamically using the K-Means algorithm. Channels of the same type, which are highly redundant, are learned jointly and share the same operation. Channels of different types, which are highly specific, are learned independently to avoid interference from channel noise. In contrast to the single spatial and multiple spatial strategies, we design a spatial clustering strategy that separates spatial information autonomously and dynamically using the expectation-maximization (EM) algorithm. This strategy extracts multiple kinds of required spatial information from different spatial locations in a single pass. We further assign learnable weight parameters to distinguish dominant information from auxiliary information, which alleviates interference from spatial noise. Our strategies better balance the additional overhead against information processing quality, which is crucial for the proposed attention to calibrate underwater information quickly and accurately.
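As a rough illustration of the channel clustering idea, the following Python sketch groups channels with K-Means and lets each group share one operation. It is not the authors' implementation: the global-average-pooled channel descriptor, the sigmoid gate standing in for a learned per-cluster operation, and the function names `channel_descriptors`, `kmeans_1d`, and `recalibrate_channels` are all assumptions made for this example.

```python
import math


def channel_descriptors(feature_map):
    """Summarize each channel (an H x W nested list) by its mean activation.

    Assumption: global average pooling is used as the channel descriptor.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def kmeans_1d(values, k, iters=20):
    """Plain K-Means on scalar values; returns (labels, centers)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialization: spread the centers over the sorted values.
    if k == 1:
        centers = [ordered[n // 2]]
    else:
        centers = [ordered[i * (n - 1) // (k - 1)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest center for every value.
        labels = [
            min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers


def recalibrate_channels(feature_map, k):
    """Scale every channel by a gate shared within its cluster.

    Assumption: a sigmoid of the cluster center stands in for the learned
    operation that channels of one cluster would share.
    """
    labels, centers = kmeans_1d(channel_descriptors(feature_map), k)
    gates = [1.0 / (1.0 + math.exp(-c)) for c in centers]
    scaled = [
        [[gates[lab] * x for x in row] for row in channel]
        for channel, lab in zip(feature_map, labels)
    ]
    return scaled, labels, gates


# Four 2x2 channels: two weakly activated, two strongly activated.
features = [
    [[0.1, 0.1], [0.1, 0.1]],
    [[0.2, 0.2], [0.2, 0.2]],
    [[5.0, 5.0], [5.0, 5.0]],
    [[5.1, 5.1], [5.1, 5.1]],
]
scaled, labels, gates = recalibrate_channels(features, k=2)
print(labels)
```

In this toy input the two weakly activated channels land in one cluster and the two strongly activated channels in the other, so each pair is scaled by the same gate. A trained module would presumably replace the fixed sigmoid gate with learned parameters per cluster.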
To achieve high-precision, real-time underwater object detection, we propose a system that combines the UCOEA underwater adapter with a one-stage YOLO detector and can efficiently detect small, medium, and large targets simultaneously. Extensive experiments demonstrate the effectiveness of our work. More importantly, we release DLMU2024, an underwater detection dataset with low image continuity and high data diversity, which provides reliable support for the rapid development of underwater detection research. Our dataset is available at https://github.com/shenxin-dlmu/DLMU2024.
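The spatial clustering strategy described in the abstract can be sketched in a similar spirit. The Python example below fits a one-dimensional Gaussian mixture to per-location descriptors with EM and blends the resulting soft masks with fixed weights. This is only an illustration under stated assumptions, not the paper's method: the scalar descriptor per location, the Gaussian mixture model, the fixed `weights` standing in for learnable parameters, and the names `em_gmm_1d` and `spatial_mask` are all hypothetical.

```python
import math


def em_gmm_1d(xs, k, iters=30):
    """Fit a 1-D Gaussian mixture with EM; returns (responsibilities, means)."""
    ordered = sorted(xs)
    n = len(xs)
    # Deterministic initialization: spread the means over the sorted values.
    if k == 1:
        means = [ordered[n // 2]]
    else:
        means = [ordered[i * (n - 1) // (k - 1)] for i in range(k)]
    center = sum(xs) / n
    var0 = sum((x - center) ** 2 for x in xs) / n + 1e-6
    variances = [var0] * k
    priors = [1.0 / k] * k
    resp = [[1.0 / k] * k for _ in xs]
    for _ in range(iters):
        # E-step: posterior probability of each component at each location.
        resp = []
        for x in xs:
            dens = [
                priors[j]
                * math.exp(-((x - means[j]) ** 2) / (2.0 * variances[j]))
                / math.sqrt(2.0 * math.pi * variances[j])
                for j in range(k)
            ]
            total = sum(dens)
            if total > 0.0:
                resp.append([d / total for d in dens])
            else:
                resp.append([1.0 / k] * k)  # numerical fallback
        # M-step: re-estimate priors, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = (
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
                + 1e-6  # variance floor keeps the densities finite
            )
            priors[j] = nj / n
    return resp, means


def spatial_mask(xs, k, weights):
    """Blend the k soft masks into one attention value per location.

    Assumption: `weights` are fixed here; in a trained module they would be
    learnable parameters separating dominant from auxiliary information.
    """
    resp, means = em_gmm_1d(xs, k)
    mask = [sum(w * r for w, r in zip(weights, row)) for row in resp]
    return mask, means


# Flattened 2x3 map of per-location descriptors: background, then object.
descriptors = [0.0, 0.1, 0.05, 4.0, 4.1, 3.9]
mask, means = spatial_mask(descriptors, k=2, weights=[0.3, 1.0])
print([round(m, 2) for m in mask])
```

With these toy values the mask comes out close to 0.3 on the three background locations and close to 1.0 on the three object locations, because each location is assigned almost entirely to one mixture component. The soft responsibilities are what allow several spatial masks to be read off from a single fit.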
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.