{"title":"Energy-Efficient Radix-4 Belief Propagation Polar Code Decoding Using an Efficient Sign-Magnitude Adder and Clock Gating","authors":"O. Meteer, Arvid B. Van Den Brink, M. Bekooij","doi":"10.1109/DSD57027.2022.00026","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00026","url":null,"abstract":"Polar encoding is the first information coding method that has been proven to achieve channel capacity for binary-input discrete memoryless channels. Since its introduction, much research has been done on improving decoding performance, execution time and energy efficiency. Classic belief propagation uses radix-2 decoding, but a recent study proposed radix-4 decoding which reduces memory usage by 50%. However a drawback is its higher computational complexity, negatively impacting energy usage and throughput. In this paper we present an energy-efficient radix-4 belief propagation polar decoder architecture that uses a new sign-magnitude adder that does not require conversion to two's complement and back. On top of that we also propose using clock gating of input values by checking if all $R$ inputs of the decoder are zero. These two key contributions lead to a more energy -efficient design that is smaller and has higher maximum clock speed and throughput. Post-layout simulation results show that compared to the previously proposed 1024-bit radix-4 belief propagation polar code decoder, our decoder uses between 30.22 % and 32.80 % less power and is 5.2 % smaller at the same clock speed. Also, our design can achieve a 15.7% higher clock speed at which it is still up to 10.76% more power efficient and 4.8% smaller.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114168466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Majority-based Approximate Adder for FPGAs","authors":"B. Ghavami, Mahdi Sajedi, Mohsen Raji, Zhenman Fang, Lesley Shannon","doi":"10.1109/DSD57027.2022.00017","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00017","url":null,"abstract":"The most advanced ASIC-based approximate adders are focused on gate or transistor level approximating structures. However, due to architectural differences between ASIC and FPGA, comparable performance gains for FPGA-based approximate adders cannot be obtained using ASIC-based approximation ones. In this paper, we propose a method for designing a low-error approximate adder that effectively deploys the modern FPGA structure. We introduce an FPGA-based approximate adder, named as Majority Approximate Adder (MAA), with less error than the advanced approximate adders. MAA is constructed using an approximate part and an accurate one; i.e. the accurate part is based on a smaller carry-chain compared with the carry-chain of the corresponding accurate adder. In addition, approximate part is designed to use FPGA resources efficiently with a low mean error distance (MED). Experimental results based on Monte-Carlo simulation demonstrates that a 16-bit MAA has a 49.92% lower MED than the state of the art FPGA-based approximate adder. MAA also takes up less area and consumes less power than other FPGA-based approximate adders in the literature.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114908853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PositIV:A Configurable Posit Processor Architecture for Image and Video Processing","authors":"Akshat Ramachandran, John L. Gustafson, Anusua Roy, R. Ansari, R. Daruwala","doi":"10.1109/DSD57027.2022.00022","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00022","url":null,"abstract":"Image processing is essential for applications such as robot vision, remote sensing, computational photography, augmented reality etc. In the design of dedicated hardware for such applications, IEEE Std 754™ floating point (float) arithmetic units have been widely used. While float-based architectures have achieved favorable results, their hardware is complicated and requires a large silicon footprint. In this paper we propose a Posit-based Image and Video processor (PositIV), a completely pipelined, configurable, image processor using posit arithmetic that guarantees lower power use and smaller silicon footprint than floats. PositIV is able to effectively overlap computation with memory access and supports multidimensional addressing, virtual border handling, prefetching and buffering. It is successfully able to integrate configurability, flexibility, and ease of development with real-time performance characteristics. The performance of PositIV is validated on several image processing algorithms for different configurations and compared against state-of-the-art implementations. Additionally, we empirically demonstrate the superiority of posits in processing images for several conventional algorithms, achieving at least 35–40% improvement in image quality over standard floats.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114882306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technology Mapping for PAIG Optimised Polymorphic Circuits","authors":"R. Ruzicka, Václav Simek","doi":"10.1109/DSD57027.2022.00112","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00112","url":null,"abstract":"The concept of polymorphic electronics allows to efficiently implement two or more functions in a single circuit. It is characteristic of that approach that the currently selected function from the set of available ones depends on the state of the circuit operating environment. The key components of such circuits are polymorphic gates. Since the introduction of polymorphic electronics, just a few tens of polymorphic gates have been published. However, a large number of them exhibit parameters that fall behind ubiquitous CMOS technology, which makes their utilization for real applications rather difficult. As it turns out, the synthesis of polymorphic circuits achieves a significantly higher degree of complexity in comparison to the ordinary digital circuit. In past, many of the previously reported polymorphic circuits were designed using evolutionary principles (EA, CGP, etc.). It has been shown that the problem of scalable synthesis techniques suitable for large-scale polymorphic circuits could be addressed by the adoption of multi-level synthesis techniques such as And-Inverter-Graphs. The PAIG (Polymorphic And-Inverter-Graphs) concept and synthesis techniques based on it seem to be a promising approach. This paper shows how modern polymorphic gates could be used in combination with a PAIG-based synthesis tool to obtain an efficient implementation of complex polymorphic circuits.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"28 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125451251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Modular Polynomial Multiplier for NTT Accelerator of Crystals-Kyber","authors":"Yuma Itabashi, Rei Ueno, N. Homma","doi":"10.1109/DSD57027.2022.00076","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00076","url":null,"abstract":"This paper presents a hardware design that efficiently performs the number theoretic transform (NTT) for lattice-based cryptography. First, we propose an efficient modular multiplication method for lattice-based cryptography defined over Proth numbers. The proposed method is based on a K-RED technique specific to Proth numbers. In particular, we divide the intermediate result into the sign bit and the other absolute value bits and handle them separately to significantly reduce implementation costs. Then, we show a butterfly unit datapath of NTT and inverse INTT equipped with the proposed modular multiplier. We apply the proposed NTT accelerator to Crystals-Kyber, which is lattice-based cryptography, and evaluate its performance on Xilinx Artix-7. The results show that the proposed NTT accelerators achieve up-to 3% and 33% higher area-time efficiency in terms of LUTs and FFs, respectively, than conventional best methods. In addition, the low-latency version of the proposed NTT accelerators achieves a 18% lower-latency with an area-time efficiency (in terms of LUTs, FFs, and DSPs) than the existing fastest method.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"355 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125640435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CaW-NAS: Compression Aware Neural Architecture Search","authors":"Hadjer Benmeziane, Hamza Ouranoughi, S. Niar, Kaoutar El Maghraoui","doi":"10.1109/DSD57027.2022.00059","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00059","url":null,"abstract":"With the ever-growing demand for deep learning (DL) at the edge, building small and efficient DL architectures has become a significant challenge. Optimization techniques such as quantization, pruning or hardware-aware neural architecture search (HW-NAS) have been proposed. In this paper, we present an efficient HW-NAS; Compression-Aware Neural Architecture search (CaW-NAS), that combines the search for the architecture and its quantization policy. While former works search over a fully quantized search space, we define our search space with quantized and non-quantized architectures. Our search strategy finds the best trade-off between accuracy and latency according to the target hardware. Experimental results on a mobile platform show that, our method allows to obtain more efficient networks in terms of accuracy, execution time and energy consumption when compared to the state of the art.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125968592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event-Driven Programming of FPGA-accelerated ROS 2 Robotics Applications","authors":"Christian Lienen, M. Platzner","doi":"10.1109/DSD57027.2022.00088","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00088","url":null,"abstract":"Many applications from the robotics domain can benefit from FPGA acceleration. A corresponding key question is not only how to integrate hardware accelerators into software-centric robotics programming environments but also how to integrate more advanced approaches like dynamic partial reconfiguration. Recently, several approaches have demonstrated hardware acceleration for the robot operating system (ROS), the dominant programming environment in robotics. ROS is a middleware layer that features the composition of complex robotics applications as a set of nodes that communicate via mechanisms such as publish/subscribe, and distributes them over several compute platforms. In this paper, we present a novel approach for event-based programming of robotics applications that leverages dynamic partial reconfiguration and ReconROS, a framework for flexibly mapping ROS 2 nodes to either software or reconfigurable hardware. The approach bases on the ReconROS executor that schedules callbacks of ROS 2 nodes and utilizes a reconfigurable slot model and partial runtime reconfiguration to load hardware-based callbacks on demand. We describe the ReconROS executor approach, give design examples, and experimentally evaluate its functionality with examples.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124183501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonlinear Compression Block Codes Search Strategy","authors":"O. Novák","doi":"10.1109/DSD57027.2022.00094","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00094","url":null,"abstract":"This paper deals with extending linear compression codes by nonlinear check bits that improve the usability of decompressed patterns for testing circuits with more inputs. The earlier works used a purely random or partially random search of the nonlinear check-bits truth tables to construct the first nonlinear structures. Here, we derive deterministic rules that characterize the relationship among the nonlinear code check bits. The efficiency of the rules is demonstrated on different codes with the number of specified bits equal to three. The code parameters obtained after applying the rules overperform the parameters of the linear codes. Keeping the restrictions makes the search for the check bit truth tables faster and more efficient than can be got by a simple random search. The reached nonlinear block code (136,5,3) is the most efficient code among other loose compression codes.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124378212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference Time Reduction of Deep Neural Networks on Embedded Devices: A Case Study","authors":"Isma-Ilou Sadou, Seyed Morteza Nabavinejad, Zhonghai Lu, Masoumeh Ebrahimi","doi":"10.1109/DSD57027.2022.00036","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00036","url":null,"abstract":"From object detection to semantic segmentation, deep learning has achieved many groundbreaking results in recent years. However, due to the increasing complexity, the execution of neural networks on embedded platforms is greatly hindered. This has motivated the development of several neural network minimisation techniques, amongst which pruning has gained a lot of focus. In this work, we perform a case study on a series of methods with the goal of finding a small model that could run fast on embedded devices. First, we suggest a simple, but effective, ranking criterion for filter pruning called Mean Weight. Then, we combine this new criterion with a threshold-aware layer-sensitive filter pruning method, called T-sensitive pruning, to gain high accuracy. Further, the pruning algorithm follows a structured filter pruning approach that removes all selected filters and their dependencies from the DNN model, leading to less computations, and thus low inference time in lower-end CPUs. To validate the effectiveness of the proposed method, we perform experiments on three different datasets (with 3, 101, and 1000 classes) and two different deep neural networks (i.e., SICK-Net and MobileNet V1). We have obtained speedups of up to 13x on lower-end CPUs (Armv8) with less than 1% drop in accuracy. This satisfies the goal of transferring deep neural networks to embedded hardware while attaining a good trade-off between inference time and accuracy.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"192 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131474762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RISC-V Core with Approximate Multiplier for Error-Tolerant Applications","authors":"Anuj Verma, Priyamvada Sharma, B. P. Das","doi":"10.1109/DSD57027.2022.00040","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00040","url":null,"abstract":"RISC-V is an open-source instruction set architecture with customizable extensions to introduce operations like multiplication, division, atomic functions, and floating-point operations. In this paper, a new approximate multiplier is integrated with RI5CY (CV32E40P) processor, which can perform integer and floating-point multiplication for error-tolerant applications. The multiplication operation is required in various engineering and scientific applications, including image processing, digital signal processing, and many others. The proposed approximate multiplier is based on linear CORDIC (COordinate Rotation Digital Computer) algorithm and implemented by using only shift-add operations. It can perform multiplication and MAC (Multiply and accumulate) operations. The FPGA (Field programmable gate arrays) implementation results and ASIC (Application-specific integrated circuit) synthesis results for the proposed approximate multiplier along with RI5CY core are reported. The proposed design with RI5CY core is implemented on FPGA Xilinx Zedboard, which improves the performance by 20% and reduces power delay product (PDP) by 15.79% over the existing multipliers of the RI5CY core. Moreover, RI5CY core with the proposed approximate multiplier is synthesized using Industrial 130 nm standard cell library (ISCL) and Sub-threshold 130 nm standard cell library (STSCL) in Synopsys DC compiler. In the case of STSCL, RI5CY core with proposed approximate multiplier has 11.76% less power-consumption, 27.27% less delay, and 38.77% PDP compared to the existing multipliers of the RI5CY core.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131694057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}