{"title":"FFCANet: a frequency channel fusion coordinate attention mechanism network for lane detection","authors":"Shijie Li, Shanhua Yao, Zhonggen Wang, Juan Wu","doi":"10.1007/s00371-024-03626-6","DOIUrl":null,"url":null,"abstract":"<p>Lane line detection becomes a challenging task in complex and dynamic driving scenarios. Addressing the limitations of existing lane line detection algorithms, which struggle to balance accuracy and efficiency in complex and changing traffic scenarios, a frequency channel fusion coordinate attention mechanism network (FFCANet) for lane detection is proposed. A residual neural network (ResNet) is used as a feature extraction backbone network. We propose a feature enhancement method with a frequency channel fusion coordinate attention mechanism (FFCA) that captures feature information from different spatial orientations and then uses multiple frequency components to extract detail and texture features of lane lines. A row-anchor-based prediction and classification method treats lane line detection as a problem of selecting lane marking anchors within row-oriented cells predefined by global features, which greatly improves the detection speed and can handle visionless driving scenarios. Additionally, an efficient channel attention (ECA) module is integrated into the auxiliary segmentation branch to capture dynamic dependencies between channels, further enhancing feature extraction capabilities. The performance of the model is evaluated on two publicly available datasets, TuSimple and CULane. Simulation results demonstrate that the average processing time per image frame is 5.0 ms, with an accuracy of 96.09% on the TuSimple dataset and an F1 score of 72.8% on the CULane dataset. The model exhibits excellent robustness in detecting complex scenes while effectively balancing detection accuracy and speed. The source code is available at https://github.com/lsj1012/FFCANet/tree/master</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03626-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Lane line detection is a challenging task in complex and dynamic driving scenarios. To address the limitations of existing lane line detection algorithms, which struggle to balance accuracy and efficiency in complex and changing traffic scenarios, we propose a frequency channel fusion coordinate attention mechanism network (FFCANet) for lane detection. A residual neural network (ResNet) is used as the feature extraction backbone. We propose a feature enhancement method built on a frequency channel fusion coordinate attention mechanism (FFCA), which captures feature information from different spatial orientations and then uses multiple frequency components to extract detail and texture features of lane lines. A row-anchor-based prediction and classification method treats lane line detection as the problem of selecting lane-marking anchors within row-oriented cells predefined by global features, which greatly improves detection speed and can handle driving scenarios lacking visual cues. Additionally, an efficient channel attention (ECA) module is integrated into the auxiliary segmentation branch to capture dynamic dependencies between channels, further enhancing feature extraction capabilities (a sketch of this module is given below). The performance of the model is evaluated on two publicly available datasets, TuSimple and CULane. Experimental results show an average processing time of 5.0 ms per image frame, with an accuracy of 96.09% on the TuSimple dataset and an F1 score of 72.8% on the CULane dataset. The model exhibits excellent robustness in complex scenes while effectively balancing detection accuracy and speed. The source code is available at https://github.com/lsj1012/FFCANet/tree/master
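For readers unfamiliar with the ECA module mentioned above, the following is a minimal PyTorch sketch of the standard efficient channel attention block (ECA-Net): global average pooling, a 1D convolution across the channel descriptor with an adaptively chosen kernel size, and a sigmoid gate. It illustrates the generic published module only; the exact way FFCANet integrates it into the auxiliary segmentation branch (channel counts, placement, hyperparameters) is an assumption not specified in the abstract.

```python
import math
import torch
import torch.nn as nn


class ECA(nn.Module):
    """Efficient channel attention (generic sketch of the ECA-Net block).

    Note: hyperparameters (gamma, b) follow the original ECA-Net defaults;
    FFCANet's actual settings are not given in the abstract.
    """

    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Adaptive kernel size: k = |(log2(C) + b) / gamma|, forced to be odd.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> per-channel descriptor (N, C, 1, 1)
        y = self.pool(x)
        # Treat channels as a 1D sequence and convolve: (N, 1, C) -> (N, 1, C)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        # Back to (N, C, 1, 1) channel weights, then rescale the input
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
        return x * y.expand_as(x)


if __name__ == "__main__":
    # Example: apply ECA to a feature map from a ResNet-style backbone stage.
    feats = torch.randn(2, 256, 18, 50)
    print(ECA(channels=256)(feats).shape)  # torch.Size([2, 256, 18, 50])
```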