{"title":"UAMFDet:用于水下多模态目标检测的声光融合技术","authors":"Haojie Chen, Zhuo Wang, Hongde Qin, Xiaokai Mu","doi":"10.1002/rob.22432","DOIUrl":null,"url":null,"abstract":"Underwater object detection serves as a crucial means for autonomous underwater vehicles (AUVs) to gain awareness of their surroundings. Currently, AUVs predominantly depend on underwater optical cameras or sonar sensing techniques to furnish vital information sources for subsequent tasks such as underwater rescue and mining exploration. However, the influence of underwater light attenuation or significant background noise often leads to the failure of either the acoustic or optical sensor. Consequently, the traditional single‐modal object detection network, which relies exclusively on either the optical or acoustic modality, struggles to adapt to the varying complexities of underwater environments. To address this challenge, this paper proposes a novel underwater acoustic‐optical fusion‐based underwater multi‐modal object detection paradigm termed UAMFDet, which fuses highly misaligned acoustic‐optical features in the spatial dimension at both the fine‐grained level and the instance level. First, we propose a multi‐modal deformable self‐aligned feature fusion module to adaptively capture feature dependencies between multi‐modal targets, and perform self‐aligned multi‐modal fine‐grained feature fusion by differential fusion. Then a multi‐modal instance‐level feature matching network is designed. It matches multi‐modal instance features by a lightweight cross‐attention mechanism and performs differential fusion to achieve instance‐level feature fusion. 
In addition, we establish a data set dedicated to underwater acoustic‐optical fusion object detection tasks called UAOF, and conduct a large number of experiments on the UAOF data set to verify the effectiveness of UAMFDet.","PeriodicalId":192,"journal":{"name":"Journal of Field Robotics","volume":"15 1","pages":""},"PeriodicalIF":4.2000,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"UAMFDet: Acoustic‐Optical Fusion for Underwater Multi‐Modal Object Detection\",\"authors\":\"Haojie Chen, Zhuo Wang, Hongde Qin, Xiaokai Mu\",\"doi\":\"10.1002/rob.22432\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Underwater object detection serves as a crucial means for autonomous underwater vehicles (AUVs) to gain awareness of their surroundings. Currently, AUVs predominantly depend on underwater optical cameras or sonar sensing techniques to furnish vital information sources for subsequent tasks such as underwater rescue and mining exploration. However, the influence of underwater light attenuation or significant background noise often leads to the failure of either the acoustic or optical sensor. Consequently, the traditional single‐modal object detection network, which relies exclusively on either the optical or acoustic modality, struggles to adapt to the varying complexities of underwater environments. To address this challenge, this paper proposes a novel underwater acoustic‐optical fusion‐based underwater multi‐modal object detection paradigm termed UAMFDet, which fuses highly misaligned acoustic‐optical features in the spatial dimension at both the fine‐grained level and the instance level. First, we propose a multi‐modal deformable self‐aligned feature fusion module to adaptively capture feature dependencies between multi‐modal targets, and perform self‐aligned multi‐modal fine‐grained feature fusion by differential fusion. 
Then a multi‐modal instance‐level feature matching network is designed. It matches multi‐modal instance features by a lightweight cross‐attention mechanism and performs differential fusion to achieve instance‐level feature fusion. In addition, we establish a data set dedicated to underwater acoustic‐optical fusion object detection tasks called UAOF, and conduct a large number of experiments on the UAOF data set to verify the effectiveness of UAMFDet.\",\"PeriodicalId\":192,\"journal\":{\"name\":\"Journal of Field Robotics\",\"volume\":\"15 1\",\"pages\":\"\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2024-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Field Robotics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1002/rob.22432\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Field Robotics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1002/rob.22432","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
Underwater object detection serves as a crucial means for autonomous underwater vehicles (AUVs) to gain awareness of their surroundings. Currently, AUVs predominantly depend on underwater optical cameras or sonar sensing techniques to furnish vital information for subsequent tasks such as underwater rescue and mining exploration. However, underwater light attenuation or significant background noise often causes either the optical or the acoustic sensor to fail. Consequently, a traditional single‐modal object detection network, which relies exclusively on one modality, struggles to adapt to the varying complexities of underwater environments. To address this challenge, this paper proposes a novel acoustic‐optical fusion‐based underwater multi‐modal object detection paradigm termed UAMFDet, which fuses highly misaligned acoustic‐optical features in the spatial dimension at both the fine‐grained level and the instance level. First, we propose a multi‐modal deformable self‐aligned feature fusion module to adaptively capture feature dependencies between multi‐modal targets and perform self‐aligned multi‐modal fine‐grained feature fusion via differential fusion. Then a multi‐modal instance‐level feature matching network is designed: it matches multi‐modal instance features with a lightweight cross‐attention mechanism and applies differential fusion to achieve instance‐level feature fusion. In addition, we establish UAOF, a data set dedicated to underwater acoustic‐optical fusion object detection, and conduct extensive experiments on it to verify the effectiveness of UAMFDet.
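The abstract does not specify how the lightweight cross‐attention matching or the differential fusion is implemented, so the following is only an illustrative sketch of the general idea: optical instance features query acoustic instance features through single‐head cross‐attention, and the aligned result is fused by differencing. All function names, the single‐head simplification, and the particular differencing formula are assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(optical, acoustic):
    """Illustrative instance-level fusion (hypothetical, not UAMFDet's exact scheme).

    optical:  (N, d) optical instance features
    acoustic: (M, d) acoustic instance features
    Returns (N, d) fused features aligned to the optical instances.
    """
    d = optical.shape[1]
    # Optical features act as queries over the acoustic features,
    # so each optical instance is matched to a soft combination
    # of acoustic instances despite spatial misalignment.
    attn = softmax(optical @ acoustic.T / np.sqrt(d), axis=-1)  # (N, M)
    aligned = attn @ acoustic                                    # (N, d)
    # "Differential" fusion sketch: inject the modality difference
    # back into the optical features.
    return optical + (optical - aligned)

rng = np.random.default_rng(0)
fused = cross_attention_fuse(rng.normal(size=(4, 8)), rng.normal(size=(6, 8)))
print(fused.shape)
```

A dot-product cross-attention like this is a common lightweight choice for matching unordered instance sets of different sizes, since it needs no explicit one-to-one assignment between detections from the two modalities.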
About the journal:
The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments.
The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.