2020 57th ACM/IEEE Design Automation Conference (DAC): Latest Papers

Centaur: Hybrid Processing in On/Off-chip Memory Architecture for Graph Analytics
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218624
Abraham Addisie, V. Bertacco
Abstract: The increased use of graph algorithms in diverse fields has highlighted their inefficiencies in current chip-multiprocessor (CMP) architectures, primarily due to their seemingly random-access patterns to off-chip memory. Recently, two families of solutions have been proposed: 1) solutions that offload operations generated by all vertices from the processor cores to off-chip memory; and 2) solutions that offload only operations generated by high-degree vertices to dedicated on-chip memory, while the cores continue to process the work related to the remaining vertices. Neither approach is optimal over the full range of vertex degrees. Thus, in this work, we propose Centaur, a novel architecture that processes operations on vertex data in on- and off-chip memory. Centaur utilizes a vertex’s degree as a proxy to determine whether to process related operations in on- or off-chip memory. Centaur manages to provide up to 4.0× improvement in performance and 3.8× in energy benefits, compared to a baseline CMP, and up to a 2.0× performance boost over state-of-the-art specialized solutions.
Citations: 6
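The degree-based dispatch mentioned in the Centaur abstract can be illustrated with a minimal software sketch. This is not the authors' hardware design: the function names, the use of in-degree, the fixed `degree_threshold`, and the choice to send high-degree vertices on-chip (mirroring the second family of solutions the abstract describes) are all assumptions made for illustration.

```python
from collections import Counter


def partition_by_degree(edges, degree_threshold):
    """Split vertices into an on-chip set (high in-degree) and an off-chip set.

    Hypothetical helper: the abstract does not say which degree is used or
    how the threshold is chosen.
    """
    in_degree = Counter(dst for _, dst in edges)
    vertices = {v for edge in edges for v in edge}
    on_chip = {v for v in vertices if in_degree[v] >= degree_threshold}
    return on_chip, vertices - on_chip


def route_updates(edges, on_chip):
    """Group each edge update by where its destination vertex data lives."""
    routed = {"on_chip": [], "off_chip": []}
    for src, dst in edges:
        target = "on_chip" if dst in on_chip else "off_chip"
        routed[target].append((src, dst))
    return routed
```

With the edge list `[(0, 1), (2, 1), (3, 1), (1, 4)]` and a threshold of 2, vertex 1 is the only vertex placed on-chip, so the three updates targeting it are routed on-chip and the update to vertex 4 goes off-chip.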
CryptoPIM: In-memory Acceleration for Lattice-based Cryptographic Hardware
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218730
Hamid Nejatollahi, Saransh Gupta, M. Imani, T. Simunic, Rosario Cammarota, N. Dutt
Abstract: Quantum computers promise to solve hard mathematical problems such as integer factorization and discrete logarithms in polynomial time, making standardized public-key cryptosystems insecure. Lattice-Based Cryptography (LBC) is a promising post-quantum public key cryptographic protocol that could replace standardized public key cryptography, thanks to the inherent post-quantum resistant properties, efficiency, and versatility. A key mathematical tool in LBC is the Number Theoretic Transform (NTT), a common method to compute polynomial multiplication. It is the most compute-intensive routine and requires acceleration for practical deployment of LBC protocols. In this paper, we propose CryptoPIM, a high-throughput Processing In-Memory (PIM) accelerator for NTT-based polynomial multiplier with the support of polynomials with degrees up to 32k. Compared to the fastest FPGA implementation of an NTT-based multiplier, CryptoPIM achieves on average 31x throughput improvement with the same energy and only 28% performance reduction, thereby showing promise for practical deployment of LBC.
Citations: 28
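The CryptoPIM abstract names the Number Theoretic Transform as the standard way to compute polynomial multiplication. The following is a minimal software sketch of that idea, not the paper's in-memory design: the toy parameters (`Q = 17`, `N = 8`, `OMEGA = 2`) and all function names are assumptions chosen so the arithmetic is easy to follow, and the product is a plain cyclic convolution of length `N` rather than the ring used by any particular LBC scheme.

```python
Q = 17      # small prime modulus (illustrative only)
N = 8       # transform length, a power of two that divides Q - 1
OMEGA = 2   # primitive N-th root of unity mod Q: 2**8 % 17 == 1, 2**4 % 17 == 16


def ntt(values, omega, q):
    """Recursive radix-2 Cooley-Tukey transform over the integers modulo q."""
    n = len(values)
    if n == 1:
        return list(values)
    omega_sq = omega * omega % q
    even = ntt(values[0::2], omega_sq, q)
    odd = ntt(values[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q
        out[k + n // 2] = (even[k] - t) % q
        w = w * omega % q
    return out


def intt(values, omega, q):
    """Inverse transform: run the NTT with omega^-1, then scale by n^-1."""
    n = len(values)
    inv_omega = pow(omega, q - 2, q)  # Fermat inverse, valid because q is prime
    inv_n = pow(n, q - 2, q)
    return [v * inv_n % q for v in ntt(values, inv_omega, q)]


def poly_mul(a, b, n=N, omega=OMEGA, q=Q):
    """Multiply coefficient lists a and b modulo q; needs len(a) + len(b) - 1 <= n."""
    fa = ntt(list(a) + [0] * (n - len(a)), omega, q)
    fb = ntt(list(b) + [0] * (n - len(b)), omega, q)
    pointwise = [x * y % q for x, y in zip(fa, fb)]
    return intt(pointwise, omega, q)
```

For example, `poly_mul([1, 2], [3, 1])` returns `[3, 7, 2, 0, 0, 0, 0, 0]`, the coefficients of (1 + 2x)(3 + x) = 3 + 7x + 2x². The two forward transforms and the pointwise product replace the quadratic schoolbook loop, which is why the transform dominates the cost.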
Developing Privacy-preserving AI Systems: The Lessons learned
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218662
Huili Chen, S. Hussain, Fabian Boemer, Emmanuel Stapf, A. Sadeghi, F. Koushanfar, Rosario Cammarota
Abstract: Advances in customers' data privacy laws create pressures and pain points across the entire lifecycle of AI products. Working figures such as data scientists and data engineers need to account for the correct use of privacy-enhancing technologies such as homomorphic encryption, secure multi-party computation, and trusted execution environment when they develop, test and deploy products embedding AI models while providing data protection guarantees. In this work, we share the lessons learned during the development of frameworks to aid data scientists and data engineers to map their optimized workloads onto privacy-enhancing technologies seamlessly and correctly.
Citations: 8
Scalable Multi-FPGA Acceleration for Large RNNs with Full Parallelism Levels
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218528
Dongup Kwon, Suyeon Hur, Hamin Jang, E. Nurvitadhi, Jangwoo Kim
Abstract: The increasing size of recurrent neural networks (RNNs) makes it hard to meet the growing demand for real-time AI services. For low-latency RNN serving, FPGA-based accelerators can leverage specialized architectures with optimized dataflow. However, they also suffer from severe HW under-utilization when partitioning RNNs, and thus fail to obtain scalable performance. In this paper, we identify the performance bottlenecks of existing RNN partitioning strategies. Then, we propose a novel RNN partitioning strategy to achieve scalable multi-FPGA acceleration for large RNNs. First, we introduce three parallelism levels and exploit them by partitioning weight matrices, matrix/vector operations, and layers. Second, we examine the performance impact of collective communications and software pipelining to derive more accurate and optimal distribution results. We prototyped an FPGA-based acceleration system using multiple Intel high-end FPGAs, and our partitioning scheme allows up to 2.4x faster inference of modern RNN workloads than conventional partitioning methods.
Citations: 7
CAP’NN: Class-Aware Personalized Neural Network Inference
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218741
Maedeh Hemmat, Joshua San Miguel, A. Davoodi
Abstract: We propose CAP’NN, a framework for Class-Aware Personalized Neural Network Inference. CAP’NN prunes an already-trained neural network model based on the preferences of individual users. Specifically, by adapting to the subset of output classes that each user is expected to encounter, CAP’NN is able to prune not only ineffectual neurons but also miseffectual neurons that confuse classification, without the need to retrain the network. CAP’NN achieves up to 50% model size reduction while actually improving the top-1(5) classification accuracy by up to 2.3%(3.2%) when the user only encounters a subset of VGG-16 classes.
Citations: 5
Romeo: Conversion and Evaluation of HDL Designs in the Encrypted Domain
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218579
Charles Gouert, N. G. Tsoutsos
Abstract: As cloud computing becomes increasingly ubiquitous, protecting the confidentiality of data outsourced to third parties becomes a priority. While encryption is a natural solution to this problem, traditional algorithms may only protect data at rest and in transit, but do not support encrypted processing. In this work we introduce ROMEO, which enables easy-to-use privacy-preserving processing of data in the cloud using homomorphic encryption. ROMEO automatically converts arbitrary programs expressed in Verilog HDL into equivalent homomorphic circuits that are evaluated using encrypted inputs. For our experiments, we employ cryptographic circuits, such as AES, and benchmarks from the ISCAS’85 and ISCAS’89 suites.
Citations: 8
ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218661
Andreas Kurth, Samuel Riedel, Florian Zaruba, T. Hoefler, L. Benini
Abstract: Atomic operations are crucial for most modern parallel and concurrent algorithms, which necessitates their optimized implementation in highly-scalable manycore processors. We propose a modular and efficient, open-source ATomic UNit (ATUN) architecture that can be placed flexibly at different levels of the memory hierarchy. ATUN demonstrates near-optimal linear scaling for various synthetic and real-world workloads on an FPGA prototype with 32 RISC-V cores. We characterize the hardware complexity of our ATUN design in 22 nm FDSOI and find that it scales linearly in area (only 0.5 kGE per core) and logarithmically in the critical path.
Citations: 3
CL(R)Early: An Early-stage DSE Methodology for Cross-Layer Reliability-aware Heterogeneous Embedded Systems
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218747
Siva Satyendra Sahoo, B. Veeravalli, Akash Kumar
Abstract: Cross-layer reliability (CLR) presents a cost-effective alternative to traditional single-layer design in resource-constrained embedded systems. CLR provides the scope for leveraging the inherent fault-masking of multiple layers and exploiting application-specific tolerances to degradation in some Quality of Service (QoS) metrics. However, it can also lead to an explosion in the design complexity. State-of-the-art approaches to such joint optimization across multiple degrees of freedom can lead to degradation in the system-level Design Space Exploration (DSE) results. To this end, we propose a DSE methodology for enabling CLR-aware task-mapping in heterogeneous embedded systems. Specifically, we present novel approaches to both task and system-level analysis for performing an early-stage exploration of various design decisions. The proposed methodology results in considerable improvements over other state-of-the-art approaches and shows significant scaling with application size.
Citations: 5
A Cross-Layer Power and Timing Evaluation Method for Wide Voltage Scaling
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218682
Wenjie Fu, Leilei Jin, Ming Ling, Yu Zheng, Longxing Shi
Abstract: Wide supply voltage scaling is critical to enable worthwhile dynamic adjustment of the processor efficiency against varying workloads. In this paper, a cross-layer power and timing evaluation method is proposed to estimate the processor energy efficiency using both circuit and architectural information in a wide voltage range. The process variations are considered through statistical static timing analysis while the voltage effect is modeled through secondary iterated fittings. The error for estimating processor energy efficiency decreases to 8.29% when the supply voltage is scaled from 1.1 V to 0.6 V, whereas traditional architectural evaluations exhibit errors of more than 40%.
Citations: 3
ALSRAC: Approximate Logic Synthesis by Resubstitution with Approximate Care Set
2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218627
Chang Meng, Weikang Qian, A. Mishchenko
Abstract: Approximate computing is an emerging design technique for error-resilient applications. It improves circuit area, power, and delay at the cost of introducing some errors. Approximate logic synthesis (ALS) is an automatic process to produce approximate circuits. This paper proposes approximate resubstitution with approximate care set and uses it to build a simulation-based ALS flow. The experimental results demonstrate that the proposed method saves 7%–18% area compared to state-of-the-art methods. The code of ALSRAC is made open-source.
Citations: 15