Lightweight path aggregation network for pedestrian detection on FPGA board

IF 3.4; Zone 3 (Computer Science); JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)

Journal of Parallel and Distributed Computing Pub Date : 2025-06-16 DOI:10.1016/j.jpdc.2025.105137

Riadh Ayachi , Mouna Afif , Yahia Said , Abdessalem Ben Abdelali

Volume 204, Article 105137. Full text: https://www.sciencedirect.com/science/article/pii/S0743731525001042

Citations: 0

Abstract

In urban environments, pedestrian safety stands as a pivotal metric dictating the accuracy and efficacy of cutting-edge technologies like Advanced Driver Assistance Systems (ADAS) and autonomous vehicles. However, the deployment of such technologies introduces various constraints, notably including the computational resources of processing boards. Therefore, constructing a robust pedestrian detection system necessitates achieving a delicate balance between performance and computational complexity. In this study, we propose the development of a lightweight Convolutional Neural Network (CNN) model specifically tailored for pedestrian detection. The backbone architecture of the model was meticulously searched using a network search engine predicated on the Multi-Objective Genetic Algorithm (NSGA-II) with a customized strategy. Notably, we shifted the search space from central processing units to Multi-Processor System-on-Chip (MPSoC) devices, aligning with the practical considerations of real-world applications. Our proposed model capitalizes on the path aggregation architecture coupled with a lightweight backbone design. The core concept revolves around the efficient transfer of high semantic features from the network's bottom to its top via the shortest path, thereby enhancing detection rates without introducing undue computational complexity. To ensure compatibility with embedded devices with limited memory, the proposed model underwent compression via quantization and pruning techniques. For rigorous evaluation, we tested the pedestrian detection model on the Xilinx ZCU 102 board, utilizing the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset for training and evaluation purposes. The reported results substantiate the efficacy of our proposed model, boasting a mean average precision (mAP) of 93.6% alongside a commendable processing speed of 13 frames per second (FPS). These outcomes underscore the suitability of the proposed model for real-life scenarios, wherein ensuring a high level of safety remains paramount.
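
The abstract names NSGA-II as the basis of the backbone search but does not describe the customized strategy. As a rough illustration only, the sketch below shows the non-dominated sorting step that NSGA-II relies on, applied to a trade-off between accuracy and on-device latency. The candidate names, the mAP and latency values, and the choice of these two objectives are all assumptions made for the example, not data or design details from the paper. A full NSGA-II would also need crowding-distance selection, crossover, and mutation, which are omitted here.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, mAP in %, latency in ms).
# All values below are invented for illustration, not taken from the paper.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one.

    Objectives: maximize mAP (index 1), minimize latency (index 2).
    """
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def non_dominated_sort(pop: List[Candidate]) -> List[List[Candidate]]:
    """Split the population into Pareto fronts (front 0 is the best).

    This is a simple repeated-filter version; it yields the same fronts as
    the bookkeeping-based fast sort, just with more comparisons.
    """
    remaining = list(pop)
    fronts: List[List[Candidate]] = []
    while remaining:
        # A candidate belongs to the current front if nothing left dominates it.
        front = [c for c in remaining
                 if not any(dominates(o, c) for o in remaining)]
        fronts.append(front)
        remaining = [c for c in remaining if c not in front]
    return fronts


population = [
    ("net_a", 93.0, 80.0),
    ("net_b", 91.0, 60.0),
    ("net_c", 90.0, 85.0),
    ("net_d", 88.0, 55.0),
    ("net_e", 87.0, 70.0),
]

fronts = non_dominated_sort(population)
for rank, front in enumerate(fronts):
    print(rank, [name for name, _, _ in front])
```

In a hardware-aware search of the kind the abstract describes, the latency objective would presumably be measured or estimated on the target MPSoC rather than on a CPU; how the authors do this is not stated in the abstract.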

Source journal

Journal of Parallel and Distributed Computing (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 10.30
Self-citation rate: 2.60%
Articles published: 172
Review time: 12 months

Journal description: This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics, again covering the full range from the design to the use of our targeted systems.