LWTD: a novel light-weight transformer-like CNN architecture for driving scene dehazing

IF 2.7 | Zone 3 (Computer Science) | JCR Q2 | COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Machine Learning and Cybernetics | Pub Date: 2024-09-02 | DOI: 10.1007/s13042-024-02335-9

Zhenbo Zhang, Zhiguo Feng, Aiqi Long, Zhiyu Wang


Citations: 0

Abstract



With the rapid advancement of artificial intelligence and automation technology, interest in autonomous driving research continues to grow. However, in heavy rain, fog, and other adverse weather, suspended atmospheric particles degrade the images captured by the vehicle’s visual perception system, which hinders the autonomous driving system’s accurate perception of the road environment. To address these challenges, this article presents LWTD (Light-Weight Transformer-like DehazeNet), a computationally efficient, end-to-end, light-weight Transformer-like neural network that reconstructs haze-free images for driving tasks. The network is based on a reformulated atmospheric scattering model (ASM) and requires no prior knowledge. First, a strategy for simplifying the atmospheric light and transmission map into a feature map is adopted, a CMT (Convolutional Mapping Transformer) module is developed to extract global features, and the hazy image is decomposed into a base layer (global features) and a detail layer (local features) at the Low-Level, Medium-Level, and High-Level stages. A channel attention module is also introduced to assign a weight to each feature, and the weighted features are fused with the reformulated ASM to restore the haze-free image. Second, a joint loss function over the graphical features is formulated to further guide the network to converge toward abundant features. In addition, a dataset of real-world foggy driving scenes is constructed. Extensive experiments on synthetic and natural hazy images, with quantitative and qualitative evaluations on various datasets, confirmed the superiority of the proposed method. Further experiments validated its applicability to traffic participant detection and semantic segmentation tasks. The source code is publicly available at https://github.com/ZebGH/LWTD-Net.
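The abstract does not spell out the reformulated ASM, so the following is only an illustrative sketch and not the authors' implementation. It uses the standard atmospheric scattering model, I = J·t + A·(1 − t), and the single-map reformulation known from AOD-Net, J = K·I − K + b, which folds the transmission map t and the atmospheric light A into one map K. LWTD's own reformulation may differ. All function names, the constant b = 1, and the sample values are assumptions chosen for illustration. In a real network K would be predicted from the hazy image; here it is computed from the known t and A purely to check the algebra.

```python
import numpy as np


def add_haze(J, t, A):
    """Standard ASM: hazy image I from clean image J, transmission t, light A."""
    return J * t + A * (1.0 - t)


def k_map(I, t, A, b=1.0):
    """AOD-Net-style map K that merges t and A (assumed form, not from the paper).

    K = ((I - A) / t + (A - b)) / (I - 1); requires I != 1 everywhere.
    """
    return ((I - A) / t + (A - b)) / (I - 1.0)


def dehaze_with_k(I, K, b=1.0):
    """Recover the clean image from the hazy image and the merged map K."""
    return K * I - K + b


# Hypothetical toy data: a 4x4 RGB image with per-pixel transmission.
rng = np.random.default_rng(0)
J = rng.uniform(0.05, 0.95, size=(4, 4, 3))   # clean image, kept below 1
t = rng.uniform(0.3, 0.9, size=(4, 4, 1))     # transmission, broadcast over RGB
A = 0.9                                       # global atmospheric light

I = add_haze(J, t, A)
J_rec = dehaze_with_k(I, k_map(I, t, A))

# The reformulation is algebraically equivalent to inverting the ASM.
assert np.allclose(J_rec, J)
```

The appeal of the single-map form is that a network only has to estimate one quantity instead of estimating t and A separately, so errors in the two estimates do not compound.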


Source journal

International Journal of Machine Learning and Cybernetics (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore: 7.90

Self-citation rate: 10.70%

Articles published: 225

About the journal: Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC. Key research areas to be covered by the journal include: Machine Learning for modeling interactions between systems; Pattern Recognition technology to support discovery of system-environment interaction; Control of system-environment interactions; Biochemical interaction in biological and biologically-inspired systems; Learning for improvement of communication schemes between systems.