Application of Split Coordinate Channel Attention Embedding U2Net in Salient Object Detection
Yuhuan Wu, Yonghong Wu
Algorithms, published 2024-03-06 (Journal Article)
DOI: 10.3390/a17030109 (https://doi.org/10.3390/a17030109)
Impact factor: 1.8; JCR: Q3 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Salient object detection (SOD) aims to identify the most visually striking objects in a scene, simulating the biological visual attention system. In deep learning, attention mechanisms are commonly used as an enhancement strategy that lets a neural network concentrate on the relevant parts of its input, improving the model’s learning and prediction abilities. Existing RGB-based deep learning methods for salient object detection typically treat all regions of the extracted features equally, overlooking the fact that different regions contribute differently to the final prediction. Building on the U2Net algorithm, this paper incorporates a split coordinate channel attention (SCCA) mechanism into the feature extraction stage. SCCA applies spatial transformations along the width and height dimensions to efficiently extract the location information of the target to be detected. Although annotation-based pixel-level semantic segmentation has been successful, it assigns the same weight to every pixel, which leads to poor performance at object boundaries. To address this, a Canny edge detection loss is incorporated into the loss calculation stage to improve the model’s ability to detect object edges. Experiments on the DUTS and HKU-IS datasets confirm that the proposed strategies improve detection performance, increasing the F1-score of U2Net by 0.8% and 0.7%. The paper also compares SCCA with traditional attention modules: SCCA ranks in the top three for prediction time, mean absolute error (MAE), F1-score, and model size on both datasets.
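For readers unfamiliar with the two accuracy metrics named in the abstract, the following is a minimal sketch of how MAE and F1-score are commonly computed for a predicted saliency map against a binary ground-truth mask. It is not the authors' evaluation code: the function names, the nested-list map representation, and the 0.5 binarization threshold are assumptions made here for illustration.

```python
from typing import Sequence

# A 2-D saliency map as nested sequences, values assumed to lie in [0, 1].
Map = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map, gt: Map) -> float:
    """Average per-pixel absolute difference between prediction and ground truth."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def f1_score(pred: Map, gt: Map, threshold: float = 0.5) -> float:
    """F1 after binarizing the prediction at `threshold` (an assumed value)."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold
            actual = g >= 0.5  # ground-truth masks are treated as binary
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Hypothetical 2x2 example: one true salient pixel, one false positive.
example_pred = [[0.9, 0.2], [0.6, 0.1]]
example_gt = [[1.0, 0.0], [0.0, 0.0]]
example_mae = mean_absolute_error(example_pred, example_gt)
example_f1 = f1_score(example_pred, example_gt)
```

Lower MAE and higher F1 indicate better agreement with the annotation. Note that many SOD benchmarks report a weighted F-measure rather than plain F1, so the exact variant should be checked against the paper itself.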