Akhil Kumar, R. Dhanalakshmi, R. Rajesh, R. Sendhil

Title: A spatial features and weight adjusted loss infused Tiny YOLO for shadow detection
Journal: Signal Processing: Image Communication, vol. 140, Article 117408
DOI: 10.1016/j.image.2025.117408
Published: 2025-09-22
URL: https://www.sciencedirect.com/science/article/pii/S0923596525001547
Citations: 0
Abstract
Shadow detection in computer vision is challenging because shadows are difficult to distinguish from similarly colored or dark objects. Variations in lighting, background textures, and object shapes further complicate accurate detection. This work introduces NS-YOLO, a novel Tiny YOLO variant designed specifically for shadow detection under varying conditions. The architecture comprises a small-scale feature extraction network improved by a global attention mechanism, multi-scale spatial attention, and a spatial pyramid pooling block, while preserving effective multi-scale contextual information. In addition, a weight-adjusted CIoU loss function is introduced to enhance localization accuracy. The proposed architecture addresses shadow detection by capturing both fine details and global context, helping to distinguish shadows from similar dark regions. The enhanced loss function improves boundary localization, reducing false detections and improving accuracy. NS-YOLO is trained end-to-end from scratch on the SBU and ISTD datasets. Experiments show that NS-YOLO achieves a detection accuracy (mAP) of 59.2% while using only 35.6 BFLOPs. Compared with existing lightweight YOLO variants, i.e., Tiny YOLO and YOLO Nano models proposed between 2017 and 2025, NS-YOLO shows a relative mAP improvement of 2.5–50.1%. These results highlight its efficiency and effectiveness, making it particularly suitable for deployment on resource-limited edge devices in real-time scenarios such as video surveillance and advanced driver-assistance systems (ADAS).
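The abstract does not specify how the weight adjustment to the CIoU loss is formulated. As a rough sketch only, the standard CIoU loss (IoU term plus center-distance and aspect-ratio-consistency penalties) with a hypothetical scalar `weight` multiplier might look like the following; the `weight` parameter is an assumption for illustration, not the paper's actual formulation.

```python
import math

def ciou_loss(box_p, box_g, weight=1.0):
    """Standard CIoU loss for axis-aligned boxes (x1, y1, x2, y2),
    scaled by a hypothetical `weight` factor (illustrative only)."""
    # Intersection area
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union area and plain IoU
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    iou = inter / (area_p + area_g - inter)
    # Squared distance between box centers
    cxp, cyp = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cxg, cyg = (box_g[0] + box_g[2]) / 2, (box_g[1] + box_g[3]) / 2
    rho2 = (cxp - cxg) ** 2 + (cyp - cyg) ** 2
    # Squared diagonal of the smallest enclosing box
    ex1, ey1 = min(box_p[0], box_g[0]), min(box_p[1], box_g[1])
    ex2, ey2 = max(box_p[2], box_g[2]), max(box_p[3], box_g[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # Aspect-ratio consistency term and its trade-off coefficient
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return weight * (1 - iou + rho2 / c2 + alpha * v)
```

For perfectly matching boxes the loss is zero; for disjoint boxes the IoU term alone contributes 1 and the distance penalty pushes the loss higher, which is what drives boxes toward the ground truth even without overlap.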
About the Journal:
Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following:
To present a forum for the advancement of theory and practice of image communication.
To stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems.
To contribute to a rapid information exchange between the industrial and academic environments.
The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world.
Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments.
Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.