Authors: Yingshuo Liang, Xingyu Zhu, Jianlei Zhang
Published in: 2022 IEEE 2nd International Conference on Computer Communication and Artificial Intelligence (CCAI), 2022-05-06
DOI: https://doi.org/10.1109/CCAI55564.2022.9807763
MiTU-Net: An Efficient Mix Transformer U-like Network for Forward-looking Sonar Image Segmentation
Segmentation of forward-looking sonar (FLS) images can help underwater vehicles recognize and measure underwater crash objects. Because FLS images contain complex noise and blurred object edges, accurate segmentation requires a model with strong feature extraction ability. CNN-based semantic segmentation networks focus too heavily on local information, which may amplify the complex noise, and their computational overhead is high. To address these problems, we construct a novel, efficient Mix Transformer U-like network, named MiTU-Net, for FLS image segmentation. In addition, we introduce the online hard example mining (OHEM) cross-entropy loss function to improve the model's ability to learn from hard samples in the dataset. We carried out a series of experiments on a self-made FLS dataset. The experimental results demonstrate that MiTU-Net outperforms the other methods compared and is effective and robust for the FLS image segmentation task.
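The abstract names the OHEM cross-entropy loss but does not describe how it is computed. As a rough illustration only, the following NumPy sketch shows one common formulation of the idea: compute the per-pixel cross-entropy, keep only the "hard" pixels whose true-class probability falls below a threshold, and average the loss over those. The function name `ohem_cross_entropy` and the parameters `thresh` and `min_kept` are hypothetical choices made for this sketch and are not taken from the paper; the authors' actual implementation may differ.

```python
import numpy as np


def ohem_cross_entropy(logits, labels, thresh=0.7, min_kept=4):
    """Illustrative OHEM cross-entropy (an assumption, not the paper's code).

    logits: (N, C) array of per-pixel class scores (pixels flattened to N).
    labels: (N,) array of integer class ids.
    thresh: hypothetical probability threshold defining a "hard" pixel.
    min_kept: hypothetical minimum number of pixels that contribute.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Per-pixel cross-entropy: negative log-probability of the true class.
    losses = -log_probs[np.arange(labels.shape[0]), labels]

    # A pixel is hard when its true-class probability is below `thresh`,
    # which is the same as its loss exceeding -log(thresh).
    hard = losses > -np.log(thresh)

    if hard.sum() < min_kept:
        # Too few hard pixels: fall back to the `min_kept` largest losses.
        k = min(min_kept, losses.shape[0])
        kept = np.sort(losses)[-k:]
    else:
        kept = losses[hard]

    # Average only over the selected pixels, so easy pixels do not dilute the loss.
    return kept.mean()


if __name__ == "__main__":
    # Toy example: two confidently correct pixels and two hard pixels.
    logits = np.array([[10.0, 0.0], [0.0, 10.0], [0.0, 0.0], [0.0, 1.0]])
    labels = np.array([0, 1, 0, 0])
    print(ohem_cross_entropy(logits, labels, thresh=0.7, min_kept=2))
```

In this toy input, the two confidently classified pixels are excluded from the average, so the loss is driven entirely by the two ambiguous or misclassified pixels. This is the sense in which such a loss concentrates training on hard samples.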