Driver distraction detection based on adaptive tiny targets and lightweight networks

IF 2.7 · Zone 3, Engineering Technology · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Signal Processing-Image Communication · Pub Date: 2025-05-15 · DOI: 10.1016/j.image.2025.117342

Shuangshuang Gu, Bin Wen, Shiyao Chen, Yuanyuan Li, Guanqiu Qi, Linhong Shuai, Zhiqin Zhu

Volume 138, Article 117342 · Journal Article · Full text: https://www.sciencedirect.com/science/article/pii/S0923596525000888

Citations: 0

Abstract


Driver distraction detection based on adaptive tiny targets and lightweight networks

Driver distraction detection is critical to reducing road traffic accidents and increasing the efficiency of advanced driver assistance systems. Real-time lightweight models are especially important for in-vehicle devices with limited computing resources. However, most existing methods focus on designing lighter network architectures and ignore the performance loss when detecting tiny targets. In order to realize the collaborative optimization of tiny target detection accuracy and network lightweight, a driver distraction detection method ATD$^{2}$Net based on adaptive tiny target detection and lightweight networks is proposed. This method aims to reduce model complexity while fully capturing target features for accurate detection. ATD$^{2}$Net consists of three core modules, Channel Reconstruction Perception Module (CRPM), Dynamic Spatial Self-locking Module (DSSM) and Structural Feedback Optimization Module (SFOM). CRPM reconfigures channels and reconstructs them into batch dimensions, uses parallel strategies to perceive interactive features between channels, and significantly enhances feature extraction capabilities. DSSM adopts dynamic locking and adaptive spatial selection mechanisms to capture multi-scale features while injecting adaptive spatial information. It effectively aggregates instance features and reduces the interference of conflicting information and background information, thereby improving the detection ability of tiny targets. SFOM uses dependency trees to model inter-layer relationships and integrate coupling parameters into groupings. It uses a sparse strategy to remove unimportant parameters, achieving lightweight modeling while balancing accuracy and speed. Experimental results show that ATD$^{2}$Net is superior to the latest methods in driver distraction detection, showing excellent performance and good application prospects.
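The abstract gives no implementation details, so the following is only a rough NumPy sketch of two ideas it names: folding channel groups into the batch dimension (as described for CRPM) and removing coupled parameters as one group under a sparsity criterion (as described for SFOM). All function names, the grouping scheme, and the L2-norm importance score are assumptions made for illustration and are not taken from ATD$^{2}$Net itself.

```python
import numpy as np


def channels_to_batch(x, groups):
    """Fold channel groups into the batch axis.

    (B, C, H, W) -> (B * groups, C // groups, H, W), so each channel group
    can be processed in parallel as if it were a separate sample.
    """
    b, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    return x.reshape(b * groups, c // groups, h, w)


def batch_to_channels(x, groups):
    """Inverse of channels_to_batch."""
    bg, cg, h, w = x.shape
    return x.reshape(bg // groups, cg * groups, h, w)


def prune_coupled_channels(w1, w2, keep_ratio):
    """Prune hidden channels shared by two consecutive linear layers.

    w1 has shape (hidden, in) and w2 has shape (out, hidden). Row i of w1 and
    column i of w2 both belong to hidden channel i, so they form one coupled
    group that is kept or removed together.
    """
    hidden = w1.shape[0]
    # Hypothetical importance score: L2 norm over all parameters in the group.
    importance = np.sqrt((w1 ** 2).sum(axis=1) + (w2 ** 2).sum(axis=0))
    n_keep = max(1, int(round(hidden * keep_ratio)))
    # Indices of the n_keep most important channels, in their original order.
    keep = np.sort(np.argsort(importance)[-n_keep:])
    return w1[keep, :], w2[:, keep], keep
```

With `groups=4`, a feature map of shape `(2, 8, 3, 3)` becomes `(8, 2, 3, 3)`, and `batch_to_channels` restores the original layout. Pruning both weight matrices with the same `keep` indices keeps the two layers shape-compatible, which is the point of treating coupled parameters as a single group.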


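Source journal

Signal Processing-Image Communication (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 8.40
Self-citation rate: 2.90%
Articles published: 138
Review time: 5.2 months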

About the journal: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: To present a forum for the advancement of theory and practice of image communication. To stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems. To contribute to a rapid information exchange between the industrial and academic environments.

The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world.

Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments.

Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.