2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC): Latest Papers

A Write-friendly Arithmetic Coding Scheme for Achieving Energy-Efficient Non-Volatile Memory Systems
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431511
Yi-Shen Chen, Chun-Feng Wu, Yuan-Hao Chang, Tei-Wei Kuo
Abstract: In the era of the Internet of Things (IoT), wearable IoT devices have become popular and closely tied to daily life. Most of these devices are based on embedded systems that must operate on limited energy resources, such as batteries or energy harvesters. Therefore, energy efficiency is one of the critical issues for these devices. To lower energy consumption by reducing the total accesses to the memory and storage layers, storage-class memory (SCM) and data compression techniques are applied to eliminate data movement and shrink the data size, respectively. However, the information gap between them hinders cooperation between the two techniques and prevents further optimization of energy consumption. This work proposes a write-friendly arithmetic coding scheme that jointly manages both techniques to achieve energy-efficient non-volatile memory (NVM) systems. In particular, the concept of "ignorable bits" is introduced to skip further write operations when storing the compressed data into SCM devices. The proposed design was evaluated by a series of intensive experiments, and the results are encouraging.
Citations: 7
28GHz Phase Shifter with Temperature Compensation for 5G NR Phased-array Transceiver
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431650
Yi Zhang, Jian Pang, Kiyoshi Yanagizawa, A. Shirane, K. Okada
Abstract: A phase shifter with temperature compensation for a 28GHz phased-array TRX is presented. A precise low-voltage current reference is proposed for the IDAC biasing circuit. The measured total gain variation for a single TX path, including the phase shifter and post-stage amplifiers, over -40°C to 80°C is only 1dB, and the overall phase error due to temperature is less than 1 degree without off-chip calibration.
Citations: 1
Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431554
Sitao Huang, Aayush Ankit, P. Silveira, Rodrigo Antunes, S. R. Chalamalasetti, I. E. Hajj, Dong Eun Kim, G. Aguiar, P. Bruel, S. Serebryakov, Cong Xu, Can Li, P. Faraboschi, J. Strachan, Deming Chen, K. Roy, Wen-mei W. Hwu, D. Milojicic
Abstract: ReRAM-based accelerators have shown great potential for accelerating DNN inference because ReRAM crossbars can perform analog matrix-vector multiplication (MVM) operations with low latency and energy consumption. However, these crossbars require ADCs, which constitute a significant fraction of the cost of MVM operations. The overhead of ADCs can be mitigated via partial sum quantization. However, prior quantization flows for DNN inference accelerators do not consider partial sum quantization, which is not highly relevant to traditional digital architectures. To address this issue, we propose a mixed precision quantization scheme for ReRAM-based DNN inference accelerators in which weight quantization, input quantization, and partial sum quantization are jointly applied for each DNN layer. We also propose an automated quantization flow powered by deep reinforcement learning to search for the best quantization configuration in the large design space. Our evaluation shows that the proposed mixed precision quantization scheme and quantization flow reduce inference latency and energy consumption by up to 3.89× and 4.84×, respectively, while losing only 1.18% in DNN inference accuracy.
Citations: 30
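The idea of partial sum quantization mentioned in the abstract above can be illustrated with a small sketch. This is not the paper's scheme: it assumes, purely for illustration, that each crossbar tile produces one analog partial sum and that the ADC behaves like a uniform symmetric quantizer. The names `quantize`, `crossbar_dot`, `tile_rows`, `adc_bits`, and `max_abs` are hypothetical.

```python
def quantize(x, bits, max_abs):
    """Uniform symmetric quantizer (an assumed stand-in for an ADC)."""
    levels = 2 ** (bits - 1) - 1          # number of positive code levels
    step = max_abs / levels               # size of one quantization step
    code = round(x / step)                # nearest integer code
    code = max(-levels, min(levels, code))  # clip to the ADC range
    return code * step


def crossbar_dot(weights, inputs, tile_rows, adc_bits, max_abs):
    """Dot product computed tile by tile, quantizing each partial sum."""
    total = 0.0
    for start in range(0, len(weights), tile_rows):
        w_tile = weights[start:start + tile_rows]
        x_tile = inputs[start:start + tile_rows]
        partial = sum(w * x for w, x in zip(w_tile, x_tile))
        total += quantize(partial, adc_bits, max_abs)
    return total


# Illustrative values only; they are not taken from the paper.
weights = [0.5, -0.25, 0.75, 0.1, -0.6, 0.3, 0.2, -0.4]
inputs = [1, 0, 1, 1, 0, 1, 1, 0]
exact = sum(w * x for w, x in zip(weights, inputs))

high_precision = crossbar_dot(weights, inputs, tile_rows=4, adc_bits=8, max_abs=2.0)
low_precision = crossbar_dot(weights, inputs, tile_rows=4, adc_bits=3, max_abs=2.0)

print("exact:", exact)
print("8-bit partial sums:", high_precision)
print("3-bit partial sums:", low_precision)
```

In this toy setting, lowering `adc_bits` makes each conversion coarser, so the accumulated result drifts further from the exact dot product. That trade-off between ADC cost and accuracy is the kind a per-layer precision search would have to balance.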
Dynamic Neural Network to Enable Run-Time Trade-off between Accuracy and Latency
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431628
Li Yang, Deliang Fan
Abstract: To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to achieve resource-limited hardware acceleration. Such optimization lacks the ability to adjust the network structure in real time to adapt to dynamic hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13], [15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. Moreover, to generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, which is referred to as the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round. The constructed dynamic DNN, consisting of multiple sub-nets, provides the ability to trade off inference accuracy and latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets. We also present the run-time tuning of accuracy and latency on both GPU and CPU.
Citations: 2
A Novel Technology Mapper for Complex Universal Gates
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431561
Meng-Che Wu, A. Dao, Mark Po-Hung Lin
Abstract: Complex universal logic gates, which may have higher density and flexibility than basic logic gates and look-up tables (LUTs), are useful for cost-effective or security-oriented VLSI design requirements. However, most technology mapping algorithms aim to optimize combinational logic with basic standard cells or LUT components. It is desirable to investigate optimal technology mappers for complex universal gates in addition to basic standard cells and LUT components. This paper proposes a novel technology mapper for complex universal gates that tightly integrates the following techniques: Boolean network simulation with permutation classification, supergate library construction, dynamic-programming-based cut enumeration, and Boolean matching with optimal universal cell covering. Experimental results show that the proposed method outperforms the state-of-the-art technology mapper in ABC in terms of both area and delay.
Citations: 0
Combining Memory Partitioning and Subtask Generation for Parallel Data Access on CGRAs
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431414
Cheng Li, Jiangyuan Gu, S. Yin, Leibo Liu, Shaojun Wei
Abstract: Coarse-Grained Reconfigurable Architectures (CGRAs) are attractive reconfigurable platforms with the advantages of high performance and power efficiency. In a CGRA-based computing system, the computations are often mapped onto the CGRA with parallel memory accesses. To fully exploit the on-chip memory bandwidth, memory partitioning algorithms are widely used to reduce access conflicts. CGRAs have a fixed storage fabric and limited memory size due to severe area constraints. Previous memory partitioning algorithms assumed that data could be completely transferred into the target memory. In practice, however, on-chip storage is often insufficient to hold the complete data. To perform the computation of these applications on a memory-limited CGRA, we first develop a memory partitioning strategy with continual placement, which also avoids data preprocessing, and then divide the kernel into multiple subtasks that suit the size of the target memory. Experimental results show that, compared to the state-of-the-art method, our approach achieves a 43.2% reduction in data preparation time and an 18.5% improvement in overall performance. If the subtask generation scheme is adopted, our approach can achieve a 14.4% overall performance improvement while reducing memory requirements by 99.7%.
Citations: 1
Standard Cell Routing with Reinforcement Learning and Genetic Algorithm in Advanced Technology Nodes
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431569
Haoxing Ren, Matthew R. Fojtik
Abstract: Standard cell layout in advanced technology nodes is done manually in the industry today. Automating the standard cell layout process, in particular the routing step, is challenging because of the constraints imposed by an enormous number of design rules. In this paper we propose a machine-learning-based approach that applies a genetic algorithm to create initial routing candidates and uses reinforcement learning (RL) to fix the design rule violations incrementally. A design rule checker feeds the violations back to the RL agent, and the agent learns how to fix them based on the data. This approach is also applicable to future technology nodes with unseen design rules. We demonstrate the effectiveness of this approach on a number of standard cells. We show that it can route a cell that is deemed unroutable manually, reducing the cell size by 11%.
Citations: 9
A Hierarchical Assessment Strategy on Soft Error Propagation in Deep Learning Controller
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431573
Ting Liu, Yuzhuo Fu, Yan Zhang, Bin Shi
Abstract: Deep learning techniques have been introduced into the field of intelligent controller design in recent years and have become an effective alternative in complex control scenarios. In addition to improving control robustness, deep learning controllers (DLCs) also provide potential fault tolerance to internal disturbances (such as soft errors) due to the inherently redundant structure of deep neural networks (DNNs). In this paper, we propose a hierarchical assessment to characterize the impact of soft errors on the dependability of a PID controller and its DLC alternative. Single-bit-flip injections in the underlying hardware and time series data collection from multiple abstraction layers (ALs) are performed on a virtual prototype system based on an ARM Cortex-A9 CPU, on which a PID controller and a corresponding DLC implemented as a recurrent neural network (RNN) are deployed. We employ generative adversarial networks and Bayesian networks to characterize the local and global dependencies caused by soft errors across the system. By analyzing cross-AL fault propagation paths and component sensitivities, we discover that the parallel data processing pipelines and the regular feature size scaling mechanism in the DLC can effectively prevent critical failure-causing faults from propagating to the control output.
Citations: 1
Zero Correlation Error: A Metric for Finite-Length Bitstream Independence in Stochastic Computing
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431552
Hsuan Hsiao, Joshua San Miguel, Yuko Hara-Azumi, J. Anderson
Abstract: Stochastic computing (SC), with its probabilistic data representation format, has sparked renewed interest due to its ability to use very simple circuits to implement complex operations. Unlike traditional binary computing, however, SC needs to carefully handle correlations that exist across data values to avoid the risk of unacceptably inaccurate results. With many SC circuits designed to operate under the assumption that input values are independent, it is important to be able to accurately measure and characterize the independence of SC bitstreams. We propose zero correlation error (ZCE), a metric that quantifies how independent two finite-length bitstreams are, and show that it addresses fundamental limitations in metrics currently used by the SC community. Through evaluation at both the functional unit level and the application level, we demonstrate how ZCE can be an effective tool for analyzing SC bitstreams, simulating circuits, and design space exploration.
Citations: 3
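Why independence matters in stochastic computing, as the abstract above stresses, can be seen in a minimal sketch of the standard AND-gate multiplier for unipolar bitstreams. The sketch does not compute ZCE, whose definition is not given in the abstract; it only contrasts independent and fully correlated inputs. The helper names `to_bitstream`, `value`, and `and_multiply` are hypothetical.

```python
import random


def to_bitstream(p, n, rng):
    """Encode probability p as n bits, each 1 with probability p (unipolar format)."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def value(bits):
    """Decode a unipolar bitstream: the fraction of 1s."""
    return sum(bits) / len(bits)


def and_multiply(a, b):
    """Bitwise AND, which multiplies the encoded values only if a and b are independent."""
    return [x & y for x, y in zip(a, b)]


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 4096
a = to_bitstream(0.5, n, rng)
b = to_bitstream(0.5, n, rng)   # generated independently of a

independent = value(and_multiply(a, b))   # approximates 0.5 * 0.5
correlated = value(and_multiply(a, a))    # a AND a is just a, so this approximates 0.5

print("independent inputs:", independent)
print("identical inputs:  ", correlated)
```

With independently generated streams the AND gate approximates the product 0.25, while feeding the same stream to both inputs returns the stream's own value, roughly 0.5. A metric for bitstream independence is meant to flag cases like the second one before they corrupt a computation.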
One-pass Synthesis for Field-coupled Nanocomputing Technologies
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC) Pub Date : 2021-01-18 DOI: 10.1145/3394885.3431607
Marcel Walter, Winston Haaswijk, R. Wille, F. Sill, R. Drechsler
Abstract: Field-coupled Nanocomputing (FCN) is a class of post-CMOS emerging technologies that promises to overcome certain physical limitations of conventional solutions such as CMOS by allowing for high computational throughput with low power dissipation. Despite this promise, the design of FCN circuits is still in its infancy. In fact, state-of-the-art solutions still rely heavily on conventional synthesis approaches that do not take the tight physical constraints of FCN circuits (particularly with respect to routability and clocking) into account. Instead, physical design is conducted in a second step in which a classical logic network is mapped onto an FCN layout. This two-stage approach, with a classical, FCN-oblivious logic network as an intermediate result, frequently leads to substantial quality loss or completely impractical results. In this work, we propose a one-pass synthesis scheme for FCN circuits that conducts both steps, synthesis and physical design, in a single run. For the first time, this allows the generation of exact, i.e., minimal, FCN circuits for a given functionality.
Citations: 8