Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design: Latest Publications

COSIME
Cheung Liu, Haobang Chen, M. Imani, Kai Ni, A. Kazemi, Ann Franchesca Laguna, M. Niemier, X. Hu, Liang Zhao, Cheng Zhuo, Xunzhao Yin
{"title":"COSIME","authors":"Cheung Liu, Haobang Chen, M. Imani, Kai Ni, A. Kazemi, Ann Franchesca Laguna, M. Niemier, X. Hu, Liang Zhao, Cheng Zhuo, Xunzhao Yin","doi":"10.1145/3508352.3549412","DOIUrl":"https://doi.org/10.1145/3508352.3549412","url":null,"abstract":"In a number of machine learning models, an input query is searched across the trained class vectors to find the closest feature class vector in cosine similarity metric. However, performing the cosine similarities between the vectors in Von-Neumann machines involves a large number of multiplications, Euclidean normalizations and division operations, thus incurring heavy hardware energy and latency overheads. Moreover, due to the memory wall problem that presents in the conventional architecture, frequent cosine similarity-based searches (CSSs) over the class vectors requires a lot of data movements, limiting the throughput and efficiency of the system. To overcome the aforementioned challenges, this paper introduces COSIME, a general in-memory associative memory (AM) engine based on the ferroelectric FET (FeFET) device for efficient CSS. By leveraging the one-transistor AND gate function of FeFET devices, current-based translinear analog circuit and winner-take-all (WTA) circuitry, COSIME can realize parallel in-memory CSS across all the entries in a memory block, and output the closest word to the input query in cosine similarity metric. Evaluation results at the array level suggest that the proposed COSIME design achieves 333× and 90.5× latency and energy improvements, respectively, and realizes better classification accuracy when compared with an AM design implementing approximated CSS. 
The proposed in-memory computing fabric is evaluated for an HDC problem, showcasing that COSIME can achieve on average 47.1× and 98.5× speedup and energy efficiency improvements compared with a GPU implementation.","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114812457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
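The COSIME abstract refers to the cost of a cosine-similarity search on a conventional processor: multiplications, Euclidean normalizations, and a division for every stored class vector. A minimal software sketch of that baseline search follows. It is an illustration only, not the FeFET in-memory design the paper proposes, and the names `cosine_similarity` and `nearest_class` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))       # multiplications and accumulation
    norm_a = math.sqrt(sum(x * x for x in a))    # Euclidean norm of a
    norm_b = math.sqrt(sum(y * y for y in b))    # Euclidean norm of b
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)               # the division step


def nearest_class(query, class_vectors):
    """Return (index, score) of the class vector closest to the query."""
    best_index, best_score = -1, -math.inf
    for i, vec in enumerate(class_vectors):
        score = cosine_similarity(query, vec)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


if __name__ == "__main__":
    classes = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
    print(nearest_class([2, 2, 0], classes))
```

Every query touches every stored vector, which is the data movement the abstract attributes to the memory wall.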
SHAPE
Yuan Xu, Tiancheng He, Ruiqi Sun, Yeh-Hao Ma, Yier Jin, An Zou
{"title":"SHAPE","authors":"Yuan Xu, Tiancheng He, Ruiqi Sun, Yeh-Hao Ma, Yier Jin, An Zou","doi":"10.1145/3508352.3549409","DOIUrl":"https://doi.org/10.1145/3508352.3549409","url":null,"abstract":"Despite being employed in burgeoning efforts to accelerate artificial intelligence, heterogeneous architectures have yet to be well managed with strict timing constraints. As a classic task model, multi-segment self-suspension (MSSS) has been proposed for general I/O-intensive systems and computation offloading. However, directly applying this model to heterogeneous architectures with multiple CPUs and many processing units (PEs) suffers tremendous pessimism. In this paper, we present a real-time scheduling approach, SHAPE, for general heterogeneous architectures with significant schedulability and improved utilization rate. We start with building the general task execution pattern on a heterogeneous architecture integrating multiple CPU cores and many PEs such as GPU streaming multiprocessors and FPGA IP cores. A real-time scheduling strategy and corresponding schedulability analysis are presented following the task execution pattern. Compared with state-of-the-art scheduling algorithms through comprehensive experiments on unified and versatile tasks, SHAPE improves the schedulability by 11.1% - 100%. Moreover, experiments performed on the NVIDIA GPU systems further indicate up to 70.9% of pessimism reduction can be achieved by the proposed scheduling. 
Since we target general heterogeneous architectures, SHAPE can be directly applied to off-the-shelf heterogeneous computing systems with guaranteed deadlines and improved schedulability.","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122115453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
QuBRIM
Yiqiao Zhang, Uday Kumar Reddy Vengalam, Anshujit Sharma, Michael Huang, Z. Ignjatovic
{"title":"QuBRIM","authors":"Yiqiao Zhang, Uday Kumar Reddy Vengalam, Anshujit Sharma, Michael Huang, Z. Ignjatovic","doi":"10.1145/3508352.3549443","DOIUrl":"https://doi.org/10.1145/3508352.3549443","url":null,"abstract":"Physical Ising machines have been shown to solve combinatoric optimization problems with orders-of-magnitude improvements in speed and energy efficiency over von Neumann systems. However, building such a system is still in its infancy and a scalable, robust implementation remains challenging. CMOS-compatible electronic Ising machines (e.g., [1]) are promising as the mature technology helps bring scale, speed, and energy efficiency to the dynamical system. However, subtle issues can arise when using voltage-controlled transistors to act as programmable resistive coupling. In this paper, we propose a version of resistively-coupled Ising machine using quantized nodal interactions (QuBRIM), which significantly improved the predictability of the coupling resistor. The functionality of QuBRIM is demonstrated by solving the well-known Max-Cut problem using both behavioral and circuit level simulations in 45 nm CMOS technology node. 
We show that the dynamical system naturally seeks local minima in the objective function's energy landscape and that by applying spin-fix annealing, the system reaches a global minimum with a high probability.","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117335153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
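The QuBRIM abstract relates Max-Cut to an Ising energy landscape in which the system settles into local minima. The sketch below shows that correspondence in software under stated assumptions: spins take values +1 or -1, the energy is the sum of `w * s_i * s_j` over edges (sign conventions vary between papers), and a greedy single-spin-flip search stands in for the physical dynamics. It does not model the QuBRIM circuit or its spin-fix annealing, and all function names are hypothetical.

```python
def cut_value(edges, spins):
    """Total weight of edges whose endpoints carry opposite spins."""
    return sum(w for i, j, w in edges if spins[i] != spins[j])


def ising_energy(edges, spins):
    """Energy sum of w * s_i * s_j; equals total edge weight minus 2 * cut_value."""
    return sum(w * spins[i] * spins[j] for i, j, w in edges)


def local_search(edges, spins):
    """Flip single spins while a flip lowers the energy; stops at a local minimum.

    Assumes the graph has no self-loops.
    """
    spins = list(spins)
    improved = True
    while improved:
        improved = False
        for k in range(len(spins)):
            # Weighted sum of the neighbouring spins of node k.
            field = sum(w * spins[j if i == k else i]
                        for i, j, w in edges if k in (i, j))
            delta = -2 * spins[k] * field  # energy change if spin k is flipped
            if delta < 0:
                spins[k] = -spins[k]
                improved = True
    return spins


if __name__ == "__main__":
    square = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]  # unweighted 4-cycle
    result = local_search(square, [1, 1, 1, 1])
    print(result, cut_value(square, result), ising_energy(square, result))
```

On this small 4-cycle the greedy search happens to reach the optimum cut; on larger graphs it can stall in a local minimum, which is the situation the abstract's annealing step addresses.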
ScaleHD
Sizhe Zhang, M. Imani, Xun Jiao
{"title":"ScaleHD","authors":"Sizhe Zhang, M. Imani, Xun Jiao","doi":"10.1145/3508352.3549376","DOIUrl":"https://doi.org/10.1145/3508352.3549376","url":null,"abstract":"Brain-inspired hyperdimensional computing (HDC) has demonstrated promising capability in various cognition tasks such as robotics, bio-medical signal analysis, and natural language processing. Compared to deep neural networks, HDC models show advantages such as light-weight model and one/few-shot learning capabilities, making it a promising alternative paradigm to traditional resource-demanding deep learning models particularly in edge devices with limited resources. Despite the growing popularity of HDC, the robustness of HDC models and the approaches to enhance HDC robustness has not been systematically analyzed and sufficiently examined. HDC relies on high-dimensional numerical vectors referred to as hypervectors (HV) to perform cognition tasks and the values inside the HVs are critical to the robustness of an HDC model. We propose ScaleHD, an adaptive scaling method that scales the value of HVs in the associative memory of an HDC model to enhance the robustness of HDC models. We propose three different modes of ScaleHD including Global-ScaleHD, Class-ScaleHD, and (Class + Clip)-ScaleHD which are based on different adaptive scaling strategies. Results show that ScaleHD is able to enhance HDC robustness against memory errors up to 10,000X. Moreover, we leverage the enhanced HDC robustness in exchange for energy saving via voltage scaling method. 
Experimental results show that ScaleHD can reduce energy consumption on HDC memory system up to 72.2% with less than 1% accuracy loss.","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122598784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
TransSizer
S. Nath, G. Pradipta, Corey Hu, Tian Yang, Brucek Khailany, Haoxing Ren
{"title":"TransSizer","authors":"S. Nath, G. Pradipta, Corey Hu, Tian Yang, Brucek Khailany, Haoxing Ren","doi":"10.1145/3508352.3549442","DOIUrl":"https://doi.org/10.1145/3508352.3549442","url":null,"abstract":"Gate sizing is a fundamental netlist optimization move and researchers have used supervised learning-based models in gate sizers. Recently, Reinforcement Learning (RL) has been tried for sizing gates (and other EDA optimization problems) but are very runtime-intensive. In this work, we explore a novel Transformer-based gate sizer, TransSizer, to directly generate optimized gate sizes given a placed and unoptimized netlist. TransSizer is trained on datasets obtained from real tapeout-quality industrial designs in a foundry 5nm technology node. Our results indicate that TransSizer achieves 97% accuracy in predicting optimized gate sizes at the postroute optimization stage. Furthermore, TransSizer has a speedup of ∼1400X while delivering similar timing, power and area metrics when compared to a leading-edge commercial tool for sizing-only optimization.","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129456776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Session details: Special Session: The Dawn of Domain-Specific Hardware Accelerators for Robotic Computing
Jiang Hu
{"title":"Session details: Special Session: The Dawn of Domain-Specific Hardware Accelerators for Robotic Computing","authors":"Jiang Hu","doi":"10.1145/3578460","DOIUrl":"https://doi.org/10.1145/3578460","url":null,"abstract":"","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126577084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Routing with Cell Movement (Virtual)
Guojie Luo
{"title":"Session details: Routing with Cell Movement (Virtual)","authors":"Guojie Luo","doi":"10.1145/3578469","DOIUrl":"https://doi.org/10.1145/3578469","url":null,"abstract":"","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134583842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Energy Efficient Neural Networks via Approximate Computations
M. Najafi, Vidya Chabria
{"title":"Session details: Energy Efficient Neural Networks via Approximate Computations","authors":"M. Najafi, Vidya Chabria","doi":"10.1145/3578477","DOIUrl":"https://doi.org/10.1145/3578477","url":null,"abstract":"","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"66 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131342203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Addressing Sensor Security through Hardware/Software Co-Design
M. Wolf
{"title":"Session details: Addressing Sensor Security through Hardware/Software Co-Design","authors":"M. Wolf","doi":"10.1145/3578428","DOIUrl":"https://doi.org/10.1145/3578428","url":null,"abstract":"","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115288268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SGIRR
Cheng-Yuan Wang, Yao W. Chang, Yuan-Hao Chang
{"title":"SGIRR","authors":"Cheng-Yuan Wang, Yao W. Chang, Yuan-Hao Chang","doi":"10.1145/3508352.3549364","DOIUrl":"https://doi.org/10.1145/3508352.3549364","url":null,"abstract":"Resistive Random Access Memory (ReRAM) Crossbars are a promising process-in-memory technology to reduce enormous data movement overheads of large-scale graph processing between computation and memory units. ReRAM cells can combine with crossbar arrays to effectively accelerate graph processing, and partitioning ReRAM crossbar arrays into Operation Units (OUs) can further improve computation accuracy of ReRAM crossbars. The operation unit utilization was not optimized in previous work, incurring extra cost. This paper proposes a two-stage algorithm with a crossbar OU-aware scheme for sparse graph index remapping for ReRAM (SGIRR) crossbars, mitigating the influence of graph sparsity. In particular, this paper is the first to consider the given operation unit size with the remapping index algorithm, optimizing the operation unit and power dissipation. Experimental results show that our proposed algorithm reduces the utilization of crossbar OUs by 31.4%, improves the total OU block usage by 10.6%, and saves energy consumption by 17.2%, on average.","PeriodicalId":367046,"journal":{"name":"Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115318975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0