ACM Transactions on Design Automation of Electronic Systems: Latest Articles

Efficient Attacks on Strong PUFs via Covariance and Boolean Modeling
IF 2.2, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-08-08 DOI: 10.1145/3687469
Hongfei Wang, Wei Liu, Wenjie Cai, Yunxiao Lu, Caixue Wan
Abstract: The physical unclonable function (PUF) is a widely used hardware security primitive. Before hacking into a PUF-protected system, intruders typically initiate attacks on the PUF as the first step. Many strong PUF designs have been proposed to thwart non-invasive attacks that exploit acquired challenge-response pairs (CRPs). In this work, we propose a general framework for efficient attacks on strong PUFs by investigating from two perspectives, namely, statistical covariances in the challenge space and the design dependency among PUF compositions. The framework consists of two novel attack methods against a wide range of PUF families, including XOR APUFs, interpose PUFs, and bistable ring (BR)-PUFs. It can also exploit the knowledge of reliability information to improve attack efficiency with gradient optimization. We evaluate our proposed attacks through extensive experiments, running both software-based simulation and hardware implementations on FPGAs to compare with corresponding state-of-the-art (SOTA) works. Considerable effort has been made in ensuring identical software/hardware conditions for a fair comparison. The results demonstrate that our framework significantly outperforms SOTA results. Moreover, we show that our framework can efficiently attack diverse PUF families built from entirely different types, while almost all existing works focused solely on attacking one or a very limited number of PUF designs.
Citations: 0
PriorMSM: An Efficient Acceleration Architecture for Multi-Scalar Multiplication
IF 2.2, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-07-12 DOI: 10.1145/3678006
Changxu Liu, Hao Zhou, Patrick Dai, Li Shang, Fan Yang
Abstract: Multi-Scalar Multiplication (MSM) is a computationally intensive task that operates on elliptic curves based on GF(P). It is commonly used in zero-knowledge proofs (ZKP), where it accounts for a significant portion of the computation time required for proof generation. In this paper, we present PriorMSM, an efficient acceleration architecture for MSM. We propose a Priority-based Scheduling Mechanism (PBSM) based on a multi-FIFO and multi-bank architecture to accelerate the implementation of MSM. By increasing the pairing success rate of internal points, PBSM reduces the number of bubbles in the pipeline of point addition (PADD), consequently improving the data throughput of the pipeline. We also introduce an advanced parallel bucket aggregation algorithm, leveraging PADD's fully pipelined characteristics to significantly accelerate the implementation of bucket aggregation. We perform a sensitivity analysis on the crucial parameter, window size, in MSM. The results indicate that the window size of the MSM significantly impacts its latency. An Area-Time Product (ATP) metric is introduced to guide the selection of the optimal window size, balancing the performance and cost for practical applications of subsequent MSM implementations. PriorMSM is evaluated using the TSMC 28nm process. It achieves a maximum speedup of 10.9× compared to the previous custom hardware implementations and a maximum speedup of 3.9× compared to the GPU implementations.
Citations: 0
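Illustration for the entry above: the abstract refers to bucket aggregation and to the window size as the key MSM parameter. The following is a minimal sketch of the standard windowed bucket (Pippenger-style) method, not the PriorMSM architecture or its parallel aggregation algorithm. As a simplifying assumption, the elliptic-curve group is replaced by integers under addition modulo a prime, so "point addition" is just modular addition; the function name `msm_bucket` and all parameter values are hypothetical.

```python
def msm_bucket(scalars, points, window_bits, scalar_bits, modulus):
    """Compute sum(k_i * P_i) with the windowed bucket method.

    Toy group: integers under addition mod `modulus` (an assumption made
    for readability; a real MSM would use elliptic-curve point addition).
    """
    num_windows = (scalar_bits + window_bits - 1) // window_bits
    mask = (1 << window_bits) - 1
    result = 0
    # Process windows from most significant to least significant.
    for w in reversed(range(num_windows)):
        # Shift the running result by one window: window_bits doublings.
        for _ in range(window_bits):
            result = (result + result) % modulus
        # Bucket accumulation: points sharing the same digit go to one bucket.
        buckets = [0] * (mask + 1)
        for k, p in zip(scalars, points):
            digit = (k >> (w * window_bits)) & mask
            if digit:
                buckets[digit] = (buckets[digit] + p) % modulus
        # Bucket aggregation: sum_j j * buckets[j] using running sums only.
        running = 0
        window_sum = 0
        for j in range(mask, 0, -1):
            running = (running + buckets[j]) % modulus
            window_sum = (window_sum + running) % modulus
        result = (result + window_sum) % modulus
    return result


# Hypothetical inputs: 16-bit scalars and arbitrary group elements.
MODULUS = 2**61 - 1
SCALARS = [0x1234, 0xFFFF, 0x0001, 0x8000, 0x0ABC]
POINTS = [17, 123456789, 42, 987654321, 5]
NAIVE = sum(k * p for k, p in zip(SCALARS, POINTS)) % MODULUS
RESULTS = {c: msm_bucket(SCALARS, POINTS, c, 16, MODULUS) for c in (2, 3, 4, 8)}
```

Every window size gives the same result; what changes is the balance between the number of windows and the number of buckets per window, which is the trade-off the sensitivity analysis in the abstract is concerned with.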
Multi-Stream Scheduling of Inference Pipelines on Edge Devices - a DRL Approach
IF 2.2, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-07-11 DOI: 10.1145/3677378
Danny Pereira, Sumana Ghosh, Soumyajit Dey
Abstract: Low-power edge devices equipped with Graphics Processing Units (GPUs) are a popular target platform for real-time scheduling of inference pipelines. Such application-architecture combinations are popular in Advanced Driver-Assistance Systems (ADAS) for aiding in the real-time decision-making of automotive controllers. However, the real-time throughput sustainable by such inference pipelines is limited by resource constraints of the target edge devices. Modern GPUs, both in edge devices and workstation variants, support concurrent execution of computation kernels and data transfers using the primitive of streams, and also allow priorities to be assigned to these streams. This opens up the possibility of executing computation layers of inference pipelines within a multi-priority, multi-stream environment on the GPU. However, manually co-scheduling such applications while satisfying their throughput requirement and platform memory budget may require an unmanageable number of profiling runs. In this work, we propose a Deep Reinforcement Learning (DRL) based method for deciding the start time of various operations in each pipeline layer while optimizing the latency of execution of inference pipelines as well as memory consumption. Experimental results demonstrate the promising efficacy of the proposed DRL approach in comparison with the baseline methods, particularly in terms of real-time performance enhancements, schedulability ratio, and memory savings. We have additionally assessed the effectiveness of the proposed DRL approach using the real-time traffic simulation tool IPG CarMaker.
Citations: 0
A Power Optimization Approach for Large-scale RM-TB Dual Logic Circuits Based on an Adaptive Multi-Task Intelligent Algorithm
IF 2.2, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-07-10 DOI: 10.1145/3677033
Xiaoqian Wu, Huaxiao Liu, Peng Wang, Lei Liu, Zhenxue He
Abstract: Logic synthesis is a crucial step in integrated circuit design, and power optimization is an indispensable part of this process. However, power optimization for large-scale Mixed Polarity Reed-Muller (MPRM) logic circuits is an NP-hard problem. In this paper, we divide Boolean circuits into small-scale circuits based on the idea of divide and conquer using the proposed Dynamic Adaptive Grouping Strategy (DAGS) and the proposed circuit decomposition model. Each small-scale Boolean circuit is transformed into an MPRM logic circuit by a polarity transformation algorithm. Based on gate-level integration, we integrate small-scale circuits into an MPRM and Boolean Dual Logic (RBDL) circuit. Furthermore, the power optimization problem of RBDL circuits is a multi-task, multi-extremal, high-dimensional combinatorial optimization problem, for which we propose an Adaptive Multi-task Intelligent Algorithm (AMIA), which includes global task optimization, population reproduction, valuable knowledge transfer, and local exploration to search for the lowest power for RBDL circuits. Moreover, based on the proposed Fast Power Decomposition Algorithm (FPDA), we propose a Power Optimization Approach (POA) that uses the AMIA to find an RBDL circuit with the lowest power. Experimental results based on Microelectronics Center of North Carolina (MCNC) benchmark circuits demonstrate the effectiveness and superiority of the POA compared to state-of-the-art power optimization approaches.
Citations: 0
MAB-BMC: A Formal Verification Enhancer by Harnessing Multiple BMC Engines Together
IF 2.2, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-07-02 DOI: 10.1145/3675168
Devleena Ghosh, Sumana Ghosh, Ansuman Banerjee, R. Gajavelly, Sudhakar Surendran
Abstract: In recent times, Bounded Model Checking (BMC) engines have gained wide prominence in formal verification. Different BMC engines exist, differing in their optimization, representations and solving mechanisms used to represent and navigate the underlying state transition of the given design to be verified. The objective of this paper is to examine if combinations of BMC engines can help to combine their strengths. We propose an approach that can create a sequencing of BMC engines that can reach better depth in formal verification, as opposed to executing them alone for a specified time. Our approach uses machine learning, specifically, the Multi-Armed Bandit paradigm of reinforcement learning, to predict the best-performing BMC engine for a given unrolling depth of the underlying circuit design. We evaluate our approach on a set of benchmark designs from the Hardware Model Checking Competition (HWMCC) benchmarks and show that it outperforms the state-of-the-art BMC engines in terms of the depth reached or time taken to deduce a property violation. The synthesized BMC engine sequences reach better depths than HWMCC results and the state-of-the-art technique, super_deep, for more than 80% of the cases. It also outperforms single engine runs for more than 92% of the cases where a property violation is not found within a given time duration. For designs where property violations are found within the given time duration, the synthesized sequences found the property violation in less time than HWMCC for all the designs and outperformed both super_deep and single engine runs for more than 87% of the designs.
Citations: 0
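Illustration for the entry above: the abstract names the Multi-Armed Bandit paradigm but does not say which bandit policy or reward signal is used. The sketch below assumes the textbook UCB1 policy and a simulated 0/1 reward (1 meaning "the chosen engine made progress in this time slot"); the engine success probabilities are made-up values, and nothing here calls a real BMC engine.

```python
import math
import random


def ucb1_select(counts, totals, t):
    """Return the arm with the highest UCB1 score; untried arms go first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(success_prob, rounds, seed=0):
    """Simulate `rounds` engine selections and return pulls per engine."""
    rng = random.Random(seed)
    counts = [0] * len(success_prob)
    totals = [0.0] * len(success_prob)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        # Simulated reward: stands in for "engine advanced the unrolling depth".
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


# Hypothetical per-slot success probabilities for three engines.
ENGINE_SUCCESS_PROB = [0.2, 0.5, 0.8]
pull_counts = run_bandit(ENGINE_SUCCESS_PROB, rounds=2000)
```

The policy keeps sampling every engine occasionally (the square-root exploration term) while concentrating most time slots on the engine with the best observed reward, which is the exploration-exploitation balance a bandit-based engine sequencer relies on.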
A Single Bitline Highly Stable, Low Power With High Speed Half-Select Disturb Free 11T SRAM Cell
IF 1.4, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-06-19 DOI: 10.1145/3653675
Lokesh Soni, Neeta Pandey
Abstract: A half-select disturb-free 11T (HF11T) static random access memory (SRAM) cell with low power, better stability and high speed is presented in this paper. The proposed SRAM cell works well with bit-interleaving design, which enhances soft-error immunity. The proposed HF11T cell is compared with other cutting-edge designs: a single-ended HS-free 11T (SEHF11T), a shared-pass-gate 11T (SPG11T), a data-dependent stack PMOS switching 10T (DSPS10T), a single-ended half-selected robust 12T (HSR12T), and 11T SRAM cells. It exhibits 4.85×/9.19× less read delay (T_RA) and write delay (T_WA), respectively, compared to the other considered SRAM cells. It achieves 1.07×/1.02× better read and write stability, respectively, than the considered SRAM cells. It shows a maximum reduction of 1.68×/4.58×/94.72×/9×/145× in leakage power, read power, write power, read power-delay product (PDP), and write PDP, respectively, relative to the considered SRAM cells. In addition, the proposed HF11T cell achieves a 10.14× higher I_on/I_off ratio than the other compared cells. These improvements come with a trade-off, resulting in 1.13× more T_RA compared to SPG11T. The simulation is performed with Cadence Virtuoso in 45nm CMOS technology at a supply voltage (V_DD) of 0.6 V.
Citations: 0
A Cost-Driven Chip Partitioning Method for Heterogeneous 3D Integration
IF 1.4, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-06-14 DOI: 10.1145/3672558
Cheng-Hsien Lin, Kuan-Ting Chen, Yi-Yu Liu, Allen C.-H. Wu, TingTing Hwang
Abstract: 3D IC offers significant benefits in terms of performance and cost. Existing research in through-silicon via (TSV)-based 3D integrated circuit (IC) partitioning has focused on minimizing the number of TSVs to reduce costs. Partitioning methods based on heterogeneous integration have emerged as viable approaches for cost optimization. Leveraging mature processes to manufacture blocks that are not timing-critical can yield cost benefits. Nevertheless, none of the previous 3D partitioning work has focused on reducing the overall cost, including both design and manufacturing costs, for heterogeneous 3D integration. Moreover, throughput constraints have not been considered. This paper presents a cost-aware integer linear programming (ILP)-based formulation and a heuristic algorithm that partition the functional blocks in the design into different technological groups. Each group of functional blocks will be implemented using a particular process technology, and then integrated into a 3D IC. Our results show that 3D heterogeneous integration chip implementation can reduce overall cost while satisfying various timing constraints.
Citations: 0
Automatic Correction of Arithmetic Circuits in the Presence of Multiple Bugs by Groebner Basis Modification
IF 1.4, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-06-12 DOI: 10.1145/3672559
Negar Aghapour Sabbagh, B. Alizadeh
Abstract: One promising approach to verify large arithmetic circuits is making use of Symbolic Computer Algebra (SCA), where the circuit and the specification are translated to a set of polynomials, and the verification is performed by ideal membership testing. Here, the main problem is the monomial explosion for buggy arithmetic circuits, which makes obtaining the word-level remainder infeasible. So, automatic correction of such circuits remains a significant challenge. Our proposed correction method partitions the circuit based on primary output bits and modifies the related Groebner basis based on the given suspicious gates, which makes it independent of the word-level remainder. We have applied our method to various signed and unsigned multipliers, with various sizes and numbers of suspicious and buggy gates. The results show that the proposed method corrects the bugs without area overhead. Moreover, it is able to correct the buggy circuit on average 51.9× and 45.72× faster than the state-of-the-art correction techniques for circuits with single and multiple bugs, respectively.
Citations: 0
Estimating Power, Performance, and Area for On-Sensor Deployment of AR/VR Workloads Using an Analytical Framework
IF 1.4, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-06-07 DOI: 10.1145/3670404
Xiaoyu Sun, Xiaochen Peng, Sai Zhang, J. Gómez, W. Khwa, Syed Sarwar, Ziyun Li, Weidong Cao, Zhao Wang, Chiao Liu, Meng-Fan Chang, B. Salvo, Kerem Akarvardar, H.-S. Philip Wong
Abstract: Augmented Reality and Virtual Reality have emerged as the next frontier of intelligent image sensors and computer systems. In these systems, 3D die stacking stands out as a compelling solution, enabling in-situ processing capability of the sensory data for tasks such as image classification and object detection at low power, low latency, and a small form factor. These intelligent 3D CMOS Image Sensor (CIS) systems present a wide design space, encompassing multiple domains (e.g., computer vision algorithms, circuit design, system architecture, and semiconductor technology, including 3D stacking) that have not been explored in depth so far. This paper aims to fill this gap. We first present an analytical evaluation framework, STAR-3DSim, dedicated to rapid pre-RTL evaluation of 3D-CIS systems capturing the entire stack from the pixel layer to the on-sensor processor layer. With STAR-3DSim, we then propose several knobs for PPA (power, performance, area) improvement of the Deep Neural Network (DNN) accelerator that can provide up to 53%, 41%, and 63% reduction in energy, latency, and area, respectively, across a broad set of relevant AR/VR workloads. Lastly, we present full-system evaluation results by taking image sensing, cross-tier data transfer, and off-sensor communication into consideration.
Citations: 0
Advancing Hyperdimensional Computing Based on Trainable Encoding and Adaptive Training for Efficient and Accurate Learning
IF 1.4, Zone 4 (Computer Science)
ACM Transactions on Design Automation of Electronic Systems Pub Date: 2024-06-04 DOI: 10.1145/3665891
Jiseung Kim, Hyunsei Lee, Mohsen Imani, Yeseong Kim
Abstract: Hyperdimensional computing (HDC) is a computing paradigm inspired by the mechanisms of human memory, characterizing data through high-dimensional vector representations, known as hypervectors. Recent advancements in HDC have explored its potential as a learning model, leveraging its straightforward arithmetic and high efficiency. The traditional HDC frameworks are hampered by two primary static elements: randomly generated encoders and fixed learning rates. These static components significantly limit model adaptability and accuracy. The static, randomly generated encoders, while ensuring high-dimensional representation, fail to adapt to evolving data relationships, thereby constraining the model's ability to accurately capture and learn from complex patterns. Similarly, the fixed nature of the learning rate does not account for the varying needs of the training process over time, hindering efficient convergence and optimal performance. This paper introduces TrainableHD, a novel HDC framework that enables dynamic training of the randomly generated encoder depending on the feedback of the learning data, thereby addressing the static nature of conventional HDC encoders. TrainableHD also enhances the training performance by incorporating adaptive optimizer algorithms in learning the hypervectors. We further refine TrainableHD with effective quantization to enhance efficiency, allowing the execution of the inference phase in low-precision accelerators. Our evaluations demonstrate that TrainableHD significantly improves HDC accuracy by up to 27.99% (averaging 7.02%) without additional computational costs during inference, achieving a performance level comparable to state-of-the-art deep learning models. Furthermore, TrainableHD is optimized for execution speed and energy efficiency. Compared to deep learning on a low-power GPU platform like NVIDIA Jetson Xavier, TrainableHD is 56.4 times faster and 73 times more energy efficient. This efficiency is further augmented through the use of Encoder Interval Training (EIT) and adaptive optimizer algorithms, enhancing the training process without compromising the model's accuracy.
Citations: 0
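Illustration for the entry above: the abstract contrasts TrainableHD with traditional HDC, where a randomly generated encoder stays fixed after creation. The sketch below shows only that conventional baseline (random projection to bipolar hypervectors, class hypervectors formed by bundling, dot-product classification); it is not TrainableHD and does not train the encoder. The function names, the dimensionality of 1000, and the toy data are all assumptions chosen for illustration.

```python
import random


def make_encoder(num_features, dim, seed=0):
    """Static random encoder: a dim x num_features Gaussian projection."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_features)] for _ in range(dim)]


def encode(encoder, x):
    """Map a feature vector to a bipolar (+1/-1) hypervector."""
    return [1 if sum(w * v for w, v in zip(row, x)) >= 0 else -1 for row in encoder]


def train(encoder, samples, labels):
    """Bundle (element-wise add) the hypervectors of each class."""
    classes = {}
    for x, y in zip(samples, labels):
        acc = classes.setdefault(y, [0] * len(encoder))
        for i, bit in enumerate(encode(encoder, x)):
            acc[i] += bit
    return classes


def classify(encoder, classes, x):
    """Return the class whose hypervector is most similar to the query."""
    hv = encode(encoder, x)
    return max(classes, key=lambda y: sum(a * b for a, b in zip(classes[y], hv)))


# Toy, hypothetical data: two well-separated classes in four features.
ENCODER = make_encoder(num_features=4, dim=1000)
SAMPLES = [[1.0, 1.0, 0.0, 0.0], [0.9, 1.0, 0.0, 0.1],
           [0.0, 0.0, 1.0, 1.0], [0.1, 0.0, 1.0, 0.9]]
LABELS = ["a", "a", "b", "b"]
MODEL = train(ENCODER, SAMPLES, LABELS)
```

Because `ENCODER` never changes after `make_encoder` returns, the quality of the representation depends entirely on the random draw, which is the limitation the abstract says a trainable encoder is meant to address.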