Yuchen Zheng, Yuxin Jing, Jufeng Zhao, Guangmang Cui
LAM-YOLO: Drones-based small object detection on lighting-occlusion attention mechanism YOLO
Computer Vision and Image Understanding, volume 261, Article 104489. Published 2025-09-03. DOI: 10.1016/j.cviu.2025.104489
https://www.sciencedirect.com/science/article/pii/S1077314225002127
Citation count: 0
Abstract
Drone-based target detection presents inherent challenges, including the high density and overlap of targets in drone images, as well as the blurriness of targets under varying lighting conditions, which complicates accurate identification. Traditional methods often struggle to detect numerous small, densely packed targets against complex backgrounds. To address these challenges, we propose LAM-YOLO, an object detection model specifically designed for drone-based applications. First, we introduce a light-occlusion attention mechanism to enhance the visibility of small targets under diverse lighting conditions. Additionally, we incorporate Involution modules to improve feature layer interactions. Second, we employ an improved SIB-IoU as the regression loss function to accelerate model convergence and enhance localization accuracy. Finally, we implement a novel detection strategy by introducing two auxiliary detection heads to better identify smaller-scale targets. Our quantitative results demonstrate that LAM-YOLO outperforms methods such as Faster R-CNN, YOLOv11, and YOLOv12 in terms of mAP@0.5 and mAP@0.5:0.95 on the VisDrone2019 public dataset. Compared to the original YOLOv8, the average precision increases by 7.1%. Additionally, the proposed SIB-IoU loss function not only accelerates convergence speed during training but also improves average precision compared to the traditional loss function.
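For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of plain box IoU and the IoU thresholds conventionally used for mAP@0.5 (a single threshold of 0.5) and mAP@0.5:0.95 (ten thresholds from 0.5 to 0.95 in steps of 0.05). It is not the SIB-IoU loss, whose definition is not given in this abstract. The helper names `iou`, `iou_thresholds`, and `matched_thresholds` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Plain intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds() -> List[float]:
    """The ten thresholds 0.50, 0.55, ..., 0.95 averaged over in mAP@0.5:0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred: Box, gt: Box) -> List[float]:
    """Thresholds at which this prediction would count as a match for gt."""
    value = iou(pred, gt)
    return [t for t in iou_thresholds() if value >= t]


# The same 2-pixel horizontal shift applied to a small and a large object.
small_gt, small_pred = (0, 0, 10, 10), (2, 0, 12, 10)
large_gt, large_pred = (0, 0, 100, 100), (2, 0, 102, 100)

print(len(matched_thresholds(small_pred, small_gt)))  # 4 of 10 thresholds
print(len(matched_thresholds(large_pred, large_gt)))  # 10 of 10 thresholds
```

The example illustrates why small targets are demanding under the stricter metric: a 2-pixel offset leaves a 10×10 box with an IoU of about 0.67, which passes only the four loosest thresholds, while the same offset on a 100×100 box gives about 0.96 and passes all ten.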
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems