IEEE Embedded Systems Letters: Latest Articles

Safety-Driven DNN Sizing for Vehicular CPS

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-08-04 DOI: 10.1109/LES.2025.3595839

Tingan Zhu;Mier Li;Bineet Ghosh;Samarjit Chakraborty;Parasara Sridhar Duggirala

Volume 18 (2), pages 164-167

Abstract: Perception processing in cyber-physical systems (CPSs) is now almost exclusively done using deep neural networks (DNNs). Here, camera, radar, and LiDAR data (in autonomous vehicles or robots) is fed into DNNs that detect surrounding obstacles and distances to them. These results are used by controllers to compute appropriate actuation signals. But a CPS typically has multiple state components, where each of them might be estimated using a different camera, radar or LiDAR and an associated DNN. Hence, an emerging problem is to implement multiple DNNs on a resource-constrained graphics processing unit (GPU). While many GPUs from NVIDIA and AMD allow them to be split into multiple virtual GPUs, there is little work on how to partition them, and therefore size the corresponding DNNs, when they are a part of the same CPS. In contrast to the existing practice of focusing on the inference accuracy of individual DNNs in isolation, we propose a system-level safety-driven DNN sizing (and hence GPU partitioning) scheme for vehicular CPS. Our main technical contribution is a detailed experimental evaluation of this DNN sizing approach and an empirical validation of the formal technique behind it.

Citations: 0

A Memory Representation of Random Forests Optimized for Resource-Limited Embedded Devices

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-03-28 DOI: 10.1109/LES.2025.3574563

Justin Beaurivage;Messaoud Ahmed Ouameur;Frédéric Domingue

Volume 18 (2), pages 115-118

Abstract: Random forests (RFs) are a versatile and effective machine learning technique widely applied across various tasks. With the increasing demand for deploying machine learning models on resource-constrained embedded devices, such as microcontrollers, challenges arise from the growing complexity of modern datasets. These challenges often result in models that are too large in memory and storage requirements to be feasibly implemented on small devices. In this letter, we propose a lossless memory representation of RFs that significantly limits the amount of random-access memory (RAM) required for prediction tasks, while also reducing the amount of nonvolatile memory needed to store the model. The approach achieves efficiency by embedding the data of leaf nodes within the decision nodes, thereby streamlining the tree structure. Additionally, it supports in-place prediction without requiring a decompression step. To evaluate our method, we implemented four RFs derived from real-world datasets onto four microcontroller platforms. Our results demonstrate that prediction tasks can be performed using at most 144 bytes of RAM for classification tasks, and at most 48 bytes for regression tasks, while memory accesses account for a maximum of 27.0% of the total CPU cycles. On the fastest platform, prediction times ranged between 59 and $75~\mu$s, highlighting the suitability of this method for a variety of real-time applications.
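As a rough illustration of the idea described in the abstract above (leaf data stored inside the decision nodes, with prediction performed directly on the packed bytes), the following Python sketch uses a hypothetical 10-byte node layout. The field widths, the flag bits, and the names `pack_tree`, `predict_tree`, and `predict_forest` are assumptions made for this example; they are not the authors' format, and the letter itself targets microcontrollers rather than Python.

```python
import struct

# Hypothetical node layout (an assumption, not the paper's format):
# feature index (uint8), flags (uint8), threshold (float32),
# left slot (int16), right slot (int16). A slot holds either the index
# of another decision node or, when its flag bit is set, the leaf value.
NODE = struct.Struct("<BBfhh")
LEFT_IS_LEAF = 0x01
RIGHT_IS_LEAF = 0x02


def pack_tree(nodes):
    """Pack (feature, threshold, left, right) tuples into one byte string.

    Each child is ("leaf", class_id) or ("node", node_index).
    """
    buf = bytearray()
    for feature, threshold, left, right in nodes:
        flags = 0
        if left[0] == "leaf":
            flags |= LEFT_IS_LEAF
        if right[0] == "leaf":
            flags |= RIGHT_IS_LEAF
        buf += NODE.pack(feature, flags, threshold, left[1], right[1])
    return bytes(buf)


def predict_tree(buf, x):
    """Walk the packed tree in place; no node objects are rebuilt."""
    index = 0
    while True:
        feature, flags, threshold, left, right = NODE.unpack_from(
            buf, index * NODE.size
        )
        if x[feature] <= threshold:
            if flags & LEFT_IS_LEAF:
                return left  # leaf value read straight from the decision node
            index = left
        else:
            if flags & RIGHT_IS_LEAF:
                return right
            index = right


def predict_forest(trees, x, n_classes):
    """Majority vote over packed trees, using one small vote array."""
    votes = [0] * n_classes
    for buf in trees:
        votes[predict_tree(buf, x)] += 1
    return votes.index(max(votes))


tree = pack_tree([
    (0, 0.5, ("leaf", 0), ("node", 1)),
    (1, 2.0, ("leaf", 1), ("leaf", 2)),
])
stump = pack_tree([(0, 0.5, ("leaf", 0), ("leaf", 2))])
print(predict_forest([tree, tree, stump], [0.9, 1.0], n_classes=3))
```

In this sketch a two-node tree occupies 20 bytes and needs no separate leaf records, which is the kind of saving the abstract attributes to folding leaves into decision nodes. The working state during prediction is a node index plus the vote array.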

Citations: 0

Padel: Priority-Based Real-Time Scheduling for GPUs

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-07-15 DOI: 10.1109/LES.2025.3589370

Atiyeh Gheibi-Fetrat;Sepideh Safari;Amirsaeed Ahmadi-Tonekaboni;Shaahin Hessabi;Hamid Sarbazi-Azad

Citations: 0

FPGA-Based Real-Time Multi-Class Vehicle Classification Using mmWave Radar

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-07-04 DOI: 10.1109/LES.2025.3586098

Anand Mohan;Hemant Kumar Meena;Mohd Wajid;Abhishek Srivastava

Volume 18 (2), pages 132-135

Abstract: The present study introduces field-programmable gate array (FPGA)-based real-time multiclass vehicle classification using millimeter wave radar (mmWave radar), which overcomes the limitations of conventional sensors such as LiDAR and cameras, which are sensitive to adverse weather and lighting conditions. On a hardware-software platform, the implementation of multiclass vehicle classification demonstrated its effectiveness. Within the realm of multiclass vehicle classification applications, the FPGA-based PYNQ-ZU (Python Productivity for Zynq) serves as an efficient embedded architecture. The reliability and accuracy of this method are improved, rendering it a promising solution for autonomous vehicles and advanced driver assistance systems (ADASs) in a variety of driving scenarios. We employed 3-D point cloud data produced by mmWave radar via a PC, then transformed it into 2-D point cloud images by top-view filtration methods. This method demonstrated greater efficacy in feature extraction with VGG-16. Multiple machine learning models were employed for classification tasks on both hardware and software platforms, achieving 100% accuracy with the random forest (RF) algorithm.
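The abstract above mentions turning 3-D radar point clouds into 2-D top-view images before feature extraction. A minimal Python sketch of one way such a projection could work is given below; it simply drops the height coordinate and counts points per grid cell. The function name `top_view_image`, the region-of-interest parameters, and the count-based pixel values are assumptions for illustration, since the abstract does not specify the authors' filtration method.

```python
def top_view_image(points, x_range, y_range, width, height):
    """Project (x, y, z) points onto the x-y plane as a count grid.

    Hypothetical helper: each pixel stores how many points fall into
    the corresponding ground-plane cell; z is ignored.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    image = [[0] * width for _ in range(height)]
    for x, y, _z in points:
        # Discard points outside the region of interest.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        col = int((x - x_min) / (x_max - x_min) * width)
        row = int((y - y_min) / (y_max - y_min) * height)
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(col, width - 1)
        row = min(row, height - 1)
        image[row][col] += 1
    return image


cloud = [(0.0, 0.0, 1.0), (0.9, 0.0, 0.5), (1.5, 3.5, 0.2), (9.0, 9.0, 0.0)]
grid = top_view_image(cloud, x_range=(0.0, 2.0), y_range=(0.0, 4.0),
                      width=2, height=4)
for row in grid:
    print(row)
```

A grid like this would normally be scaled to pixel intensities and resized to the input shape expected by the feature extractor before use.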

Citations: 0

Zero-Shot Fused Attention-Based GPT-2 Accelerator for Resource-Constrained Embedded Platform

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-06-27 DOI: 10.1109/LES.2025.3583743

Abhishek Yadav;Ayush Dixit;Binod Kumar

Volume 18 (2), pages 107-110

Abstract: This letter proposes a hardware-software co-design approach to accelerate inference of generative pretrained Transformer (GPT-2) for resource-constrained embedded applications. Essentially, a standard configuration of GPT-2 (Python-based software implementation) is redefined with high-level language (C++) to ultimately design a dedicated and optimized hardware logic of GPT-2 as an IP core, taking resources available on the ZCU104 Field-programmable gate array (FPGA) board into account. The approach leverages a zero-shot learning setup, buffer tiling, and compiler directives for implementing a fused attention-based GPT-2 architecture with ZCU104, ensuring maximum computational power is effectively squeezed from available resources and tradeoff between throughput (samples/second), power consumption (W), energy efficiency (mJ), and resource utilization is balanced. The proposed optimizations improve throughput by 25.3% (from 67.11 to 84.03 samples/sec) compared to the baseline. Moreover, a comprehensive investigation of the proposed optimization is done by leveraging the impact of layer fusion on latency, utilization, and throughput. Also, the generalizability of the proposed approach is validated by implementing various configurations of GPT-2. Codes and subsequent files are available at GitHub Repository.

Citations: 0

A Quantitative Security Ranking Method of PUF Based on the Rademacher Complexity of PUFs

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-07-07 DOI: 10.1109/LES.2025.3586724

Xuexiang Deng;Xiaole Cui;Xing Zhang

Citations: 0

Compressing Runtime Memory Usage via Activation Remapping for Deploying Deep Neural Networks on MCUs

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-03-20 DOI: 10.1109/LES.2025.3571799

Jinyu Zhan;Xiang Wang;Wei Jiang;Suidi Peng

Citations: 0

IEEE Embedded Systems Letters Publication Information

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2026-04-20 DOI: 10.1109/LES.2026.3680170

Citations: 0

Design of Approximate Floating-Point Arithmetic Units Using Hardware-Efficient Rounding Schemes

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-07-30 DOI: 10.1109/LES.2025.3593921

Myeongjin Kwak;Seokhyeon Lee;Yongtae Kim

Citations: 0

Design and Implementation of RISC-V-Based SoC for Electric Vehicle Traction Application

IF 2, Zone 4 (Computer Science)

IEEE Embedded Systems Letters Pub Date : 2026-04-01 Epub Date: 2025-07-31 DOI: 10.1109/LES.2025.3594596

G. Renjith;C. V. Raghu

Citations: 0