TwinLiteNet+: An enhanced multi-task segmentation model for autonomous driving

IF 4.9 · Zone 3, Computer Science · Q1, Computer Science, Hardware & Architecture

Computers & Electrical Engineering · Pub Date: 2025-09-20 · DOI: 10.1016/j.compeleceng.2025.110694

Quang-Huy Che, Duc-Tri Le, Minh-Quan Pham, Vinh-Tiep Nguyen, Duc-Khai Lam

Volume 128, Article 110694 · https://www.sciencedirect.com/science/article/pii/S0045790625006378

Citations: 0

Abstract


TwinLiteNet+: An enhanced multi-task segmentation model for autonomous driving

Semantic segmentation is a fundamental perception task in autonomous driving, particularly for identifying drivable areas and lane markings to enable safe navigation. However, most state-of-the-art (SOTA) models are computationally intensive and unsuitable for real-time deployment on resource-constrained embedded devices. In this paper, we introduce TwinLiteNet$^{+}$, an enhanced multi-task segmentation model designed for real-time drivable area and lane segmentation with high efficiency. TwinLiteNet$^{+}$ employs a hybrid encoder architecture that integrates stride-based dilated convolutions and depthwise separable dilated convolutions, balancing representational capacity and computational cost. To improve task-specific decoding, we propose two lightweight upsampling modules, the Upper Convolution Block (UCB) and the Upper Simple Block (USB), alongside a Partial Class Activation Attention (PCAA) mechanism that enhances segmentation precision. The model is available in four configurations, ranging from the ultra-compact TwinLiteNet$_{\text{Nano}}^{+}$ (34K parameters) to the high-performance TwinLiteNet$_{\text{Large}}^{+}$ (1.94M parameters). On the BDD100K dataset (Yu et al., 2020), TwinLiteNet$_{\text{Large}}^{+}$ achieves 92.9% mIoU for drivable area segmentation and 34.2% IoU for lane segmentation, surpassing existing state-of-the-art models while requiring $11\times$ fewer floating-point operations (FLOPs) for computation. The results compared with other models are shown in Fig. 1. Extensive evaluations on embedded devices demonstrate superior inference speed, quantization robustness (INT8/FP16), and energy efficiency, validating TwinLiteNet$^{+}$ as a compelling solution for real-world autonomous driving systems. Code is available at https://github.com/chequanghuy/TwinLiteNetPlus.
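The abstract credits depthwise separable dilated convolutions with helping to balance capacity against cost. The sketch below only illustrates that general trade-off with standard weight-count arithmetic. It is not taken from the paper or its repository: the function names and the 64-channel, 3x3 layer sizes are hypothetical, chosen for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted).

    Dilation spreads the kernel taps apart but adds no weights.
    """
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k-tap kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, not values from the paper.
    c_in, c_out, k = 64, 64, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv weights:  {standard}")
    print(f"separable conv weights: {separable}")
    print(f"reduction factor:       {standard / separable:.2f}x")
    for d in (1, 2, 4):
        print(f"dilation {d}: 3x3 kernel spans {effective_kernel(3, d)} pixels")
```

Under these assumed sizes the separable form needs several times fewer weights, while dilation widens the receptive field at no extra weight cost. The savings reported in the paper depend on its actual layer configuration, which the abstract does not specify.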


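Source journal

Computers & Electrical Engineering (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 9.20

Self-citation rate: 7.00%

Articles published: 661

Review time: 47 days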

About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.