FASE: An FPGA-Based Accelerator for Lightweight Sample Entropy With Monte Carlo Sampling

IF 3.1; Zone 2 (Engineering & Technology); JCR Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Very Large Scale Integration (VLSI) Systems; Pub Date: 2025-08-06; DOI: 10.1109/TVLSI.2025.3593020

Jiayu Liu;Yuanhang Li;Zhengyang Huang;Chao Chen;Ruiqi Chen;Bruno da Silva

Vol. 33, No. 10, pp. 2883-2896; https://ieeexplore.ieee.org/document/11115970/

Citations: 0

Abstract


Sample entropy (SampEn) is an algorithm within information entropy that enables effective analysis of biological signals. Due to the need for extensive similarity matching operations, the SampEn calculation process is time-consuming. Although a series of fast SampEn algorithms have been proposed, they remain time-intensive when processing large data volumes. Additionally, previous field-programmable gate array (FPGA)-based hardware accelerators designed for SampEn suffer from architectural design limitations, consuming substantial on-chip memory resources and operating at low frequencies. In this article, we propose FASE, an FPGA-based accelerator for lightweight sample entropy (LW-SampEn) with Monte Carlo (MC) sampling. The FASE design comprises two main parts: algorithm and hardware optimizations. On the algorithmic side, we introduce MC sampling into the merge-sort-based LW-SampEn algorithm, named MCLW-SampEn. MCLW-SampEn effectively reduces the computation load for large data volumes while maintaining algorithmic accuracy. For hardware, we first design efficient sorting and allocation modules to address boundary localization and load imbalance issues in previous accelerator designs. Then, we replicate the computation across the main phases to enable parallel processing. Finally, we deploy the design on the Pynq-Z2 board for validation. Experimental results show that the proposed MCLW-SampEn algorithm achieves an average speed up of $3\times$ over the LW-SampEn algorithm, with accuracy losses kept within 0.5%. Compared to state-of-the-art (SOTA) designs, FASE achieves an average speed up of $12.8\times$ while reducing power consumption by 89.3%. Ablation studies indicate that, for the same algorithm, FASE offers a $7.4\times$ speedup over related FPGA designs.
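To make the cost described in the abstract concrete, the following is a minimal Python sketch of the standard SampEn definition, SampEn = -ln(A/B), where B counts pairs of length-m templates within tolerance r (Chebyshev distance) and A counts the pairs that still match at length m+1. It is followed by a naive Monte Carlo variant that evaluates only a random subset of template pairs. This is an illustration under stated assumptions, not the authors' merge-sort-based LW-SampEn or MCLW-SampEn: the function names, the default parameters, the use of `<=` for the tolerance test (some definitions use a strict inequality), and the uniform pair-sampling scheme are all hypothetical choices made here.

```python
import math
import random


def _count_matches(x, m, r, pairs):
    """Count matching template pairs at length m (b) and length m + 1 (a)."""
    a = b = 0
    for i, j in pairs:
        # Chebyshev distance over the first m points of both templates.
        if max(abs(x[i + k] - x[j + k]) for k in range(m)) <= r:
            b += 1
            # A length-(m + 1) match only needs one extra point checked.
            if abs(x[i + m] - x[j + m]) <= r:
                a += 1
    return a, b


def _neg_log_ratio(a, b):
    # SampEn is undefined when no matches are found; this sketch
    # (an assumption, not a convention from the paper) returns infinity.
    if a == 0 or b == 0:
        return math.inf
    return -math.log(a / b)


def sample_entropy(x, m=2, r=0.2):
    """Direct SampEn over all template pairs; r is an absolute tolerance."""
    n = len(x) - m  # templates usable at both lengths m and m + 1
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    return _neg_log_ratio(*_count_matches(x, m, r, pairs))


def mc_sample_entropy(x, m=2, r=0.2, n_pairs=20000, seed=0):
    """Estimate SampEn from n_pairs uniformly sampled template pairs."""
    n = len(x) - m
    if n < 2:
        raise ValueError("series too short for the chosen m")
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n_pairs:
        i, j = rng.randrange(n), rng.randrange(n)
        if i != j:  # self-matches are excluded in SampEn
            pairs.append((i, j))
    return _neg_log_ratio(*_count_matches(x, m, r, pairs))
```

The direct version examines every one of the n(n-1)/2 template pairs, which is the quadratic similarity-matching cost the abstract refers to; the sampled version fixes the number of comparisons at `n_pairs` regardless of series length, at the price of estimation noise in the ratio A/B. In practice r is often chosen relative to the standard deviation of the signal, which this sketch leaves to the caller.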


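Source journal

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 6.40

Self-citation rate: 7.10%

Articles published: 187

Review time: 3.6 months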

Journal description: The IEEE Transactions on VLSI Systems is published monthly under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. The IEEE Transactions on VLSI Systems was founded to address this critical area through a common forum. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. The coverage of these Transactions thus focuses on VLSI/ULSI microelectronic systems integration.