Authors: Lei Zhang; Ao Zheng; Xiaoyan Sun; Zhipeng Sun
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Published: 2025-06-04
DOI: 10.1109/LGRS.2025.3576640
URL: https://ieeexplore.ieee.org/document/11023549/
Enhanced YOLOv11-Based River Aerial Image Detection Research
Unmanned aerial vehicles (UAVs) have difficulty detecting small, visually similar targets, so current detection algorithms struggle to accurately identify river debris, overgrazing, and suspected sand mining activities. To address the low precision and high complexity of existing models in small-target detection, this article introduces an enhanced version of YOLOv11, referred to as PAB-YOLOv11. First, the C3K2-PPA module replaces the C3K2 module in the backbone network, and a multibranch fusion approach strengthens the model's feature extraction for small targets across scales. Second, the attention for fine-grained classification (AFGC) mechanism is inserted between the neck network and the detection head; by emphasizing fine local features and dynamically adjusting the distribution of attention, it improves the recognition of similar objects. On a dataset collected from the Sanggan River basin, PAB-YOLOv11 reaches an mAP@0.5 of 64.9%, an improvement of 2.1% over the original YOLOv11 model. Compared with three mainstream models, YOLOv5s, YOLOv8s, and YOLOv11n, it improves mAP@0.5 by 3.1%, 3.2%, and 2.6%, respectively. Compared with more advanced models such as RT-DETR and DINO, it improves mAP@0.5 by 5.1% and 2.8%, respectively. These findings indicate that PAB-YOLOv11 is an effective method for river channel inspection.
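The mAP@0.5 figures above depend on deciding whether each predicted box overlaps a ground-truth box with an intersection over union (IoU) of at least 0.5. The following is a minimal sketch of that matching step only, not the article's evaluation code. The `(x1, y1, x2, y2)` box layout, the function names `iou` and `match_at_iou`, and the simplified greedy matching rule are assumptions made for illustration; evaluation toolkits differ in such details.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_at_iou(preds: Sequence[Box], truths: Sequence[Box],
                 thr: float = 0.5) -> List[bool]:
    """Flag each prediction as a true positive (True) or false positive (False).

    Predictions are assumed to be sorted by descending confidence.
    Each ground-truth box can be matched at most once (simplified greedy rule).
    """
    used = set()
    flags = []
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in used:
                continue
            v = iou(p, t)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= thr:
            used.add(best_j)
            flags.append(True)
        else:
            flags.append(False)
    return flags


if __name__ == "__main__":
    truth = [(0.0, 0.0, 10.0, 10.0)]          # a 10 x 10 pixel target
    shifted_2px = [(2.0, 0.0, 12.0, 10.0)]    # prediction offset by 2 pixels
    shifted_4px = [(4.0, 0.0, 14.0, 10.0)]    # prediction offset by 4 pixels
    print(match_at_iou(shifted_2px, truth))
    print(match_at_iou(shifted_4px, truth))
```

For a 10 x 10 pixel target, a 2-pixel horizontal offset still gives an IoU of about 0.67, while a 4-pixel offset drops it to about 0.43, below the threshold. This sensitivity to small localization errors is one reason small targets are hard to score well on mAP@0.5.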