2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC): Latest Papers, Page 2

Invited: Cross-layer approximate computing: From logic to architectures

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2906199

M. Shafique, R. Hafiz, Semeen Rehman, Walaa El-Harouni, J. Henkel

Abstract: We present a survey of approximate techniques and discuss concepts for building power-/energy-efficient computing components, ranging from approximate accelerators to arithmetic blocks (such as adders and multipliers). We provide a systematic understanding of how to generate and explore the design space of approximate components, which enables a wide range of power/energy, performance, area, and output-quality trade-offs and a high degree of design flexibility. To enable cross-layer approximate computing, it is crucial to bridge the gap between the logic layer (i.e., arithmetic blocks) and the architecture layer (and even the software layers). To this end, the paper introduces open-source libraries of low-power and high-performance approximate components. The elementary approximate arithmetic blocks (adder and multiplier) are used to develop multi-bit approximate arithmetic blocks and accelerators. An analysis of data-driven resilience and error propagation is discussed. The approximate computing components are a first step towards a systematic approach to introducing approximate computing paradigms at all levels of abstraction.
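To make the quality/energy trade-off of an approximate arithmetic block concrete, here is a minimal sketch of one well-known design from the approximate-computing literature, the lower-part OR adder (LOA). It is not taken from the authors' open-source libraries; the function names, the 8-bit default width, and the error metric are assumptions chosen for illustration.

```python
def loa_add(a: int, b: int, width: int = 8, approx_bits: int = 4) -> int:
    """Lower-part OR adder: OR the low bits, add the high bits exactly.

    Assumes 0 <= a, b < 2**width. The result may be width + 1 bits wide.
    """
    if not 0 <= approx_bits <= width:
        raise ValueError("approx_bits must lie between 0 and width")
    low_mask = (1 << approx_bits) - 1
    # Approximate part: a bitwise OR replaces the carry chain.
    low = (a | b) & low_mask
    # Carry into the exact part: AND of the top bits of the two low parts.
    carry = 0
    if approx_bits > 0:
        top = approx_bits - 1
        carry = (a >> top) & (b >> top) & 1
    # Exact part: ordinary addition of the remaining high bits.
    high = (a >> approx_bits) + (b >> approx_bits) + carry
    return (high << approx_bits) | low


def mean_abs_error(width: int, approx_bits: int) -> float:
    """Mean absolute error against exact addition over all input pairs."""
    n = 1 << width
    total = sum(
        abs(loa_add(a, b, width, approx_bits) - (a + b))
        for a in range(n)
        for b in range(n)
    )
    return total / (n * n)
```

Sweeping `approx_bits` from 0 (exact) up to `width` and recording `mean_abs_error` gives one axis of the design space the abstract refers to; the hardware cost axis (area, power) would have to come from synthesis and is not modeled here.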

Citations: 173

A model-driven approach to warp/thread-block level GPU cache bypassing

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2897966

Hongwen Dai, C. Li, Huiyang Zhou, Saurabh Gupta, Christos Kartsaklis, Mike Mantor

Citations: 19

TEMP: Thread batch enabled memory partitioning for GPU

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2898103

Mengjie Mao, Wujie Wen, Xiaoxiao Liu, J. Hu, Danghui Wang, Yiran Chen, Hai Helen Li

Citations: 8

Probabilistic bug-masking analysis for post-silicon tests in microprocessor verification

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2898072

Doowon Lee, Tom Kolan, A. Morgenshtein, V. Sokhin, Ronny Morad, A. Ziv, V. Bertacco

Abstract: Post-silicon validation has become essential in catching hard-to-detect, rarely occurring bugs that have slipped through pre-silicon verification. Post-silicon validation flows, however, are challenged by limited signal observability, which impairs their ability to diagnose and detect bugs. Indeed, bug manifestations during the execution of constrained-random tests may be masked and thus unobservable from the test's outputs. The ability to evaluate the bug-masking rate of a test provides great value in generating and/or selecting effective tests for high-coverage regressions. To this end, we propose an efficient, static bug-masking analysis solution called BugMAPI. BugMAPI tracks the information flow in a test program and estimates the probability that bugs go undetected by the checking mechanisms in place in the post-silicon platform. To achieve this goal, we leverage static code analysis and a novel, lightweight probability estimation algorithm. We evaluated BugMAPI on a range of industrial constrained-random tests and a range of bug injection models, and we found that it can estimate bug-masking rates with an accuracy of 77% in three orders of magnitude less time than an ideal dynamic analysis solution.
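The idea of statically estimating how likely a bug's effect is to be masked can be illustrated with a toy information-flow pass over a straight-line test program. This is only a sketch of the general concept, not BugMAPI's algorithm: the instruction format, the per-opcode propagation probabilities in `PROPAGATE`, and the independence assumption between registers are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# (destination register, opcode, source registers); hypothetical format.
Instr = Tuple[str, str, Sequence[str]]

# Assumed chance that a corrupted source still corrupts the result.
PROPAGATE: Dict[str, float] = {
    "mov": 1.0, "add": 1.0, "xor": 1.0, "and": 0.5, "or": 0.5,
}


def masking_probability(
    program: List[Instr], bug_index: int, observed: Sequence[str]
) -> float:
    """Estimate the chance that a bug at `bug_index` never reaches a checked register."""
    # taint[r] = estimated probability that register r holds a corrupted value.
    taint: Dict[str, float] = {program[bug_index][0]: 1.0}
    for dest, op, srcs in program[bug_index + 1:]:
        # Probability that every source is clean (sources treated as independent).
        clean = 1.0
        for src in srcs:
            clean *= 1.0 - taint.get(src, 0.0)
        # Writing dest replaces whatever taint it carried before.
        taint[dest] = PROPAGATE[op] * (1.0 - clean)
    # The bug is masked if no checked register is corrupted at the end.
    masked = 1.0
    for reg in observed:
        masked *= 1.0 - taint.get(reg, 0.0)
    return masked
```

For example, in the program `r1 = mov r0; r2 = and r1, r3; r1 = mov r4`, a bug in the first instruction is overwritten in `r1` and survives only through the `and`, so checking `r1` and `r2` at the end gives an estimated masking probability of 0.5 under the assumed table.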

Citations: 4

Quest for high-performance bufferless NoCs with single-cycle express paths and self-learning throttling

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2898075

Bhavya K. Daya, L. Peh, A. Chandrakasan

Citations: 21

Invited - A box of dots: Using scan-based path delay test for timing verification

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2905001

A. Crouch, John C. Potter

Citations: 4

Invited: Towards fail-operational Ethernet based in-vehicle networks

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2905021

Mischa Möstl, Daniel Thiele, R. Ernst

Citations: 13

Architecting energy-efficient STT-RAM based register file on GPGPUs via delta compression

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2897989

Hang Zhang, Xuhao Chen, Nong Xiao, Fang Liu

Abstract: To facilitate efficient context switches, GPUs usually employ a large-capacity register file to accommodate a massive amount of context information. However, the large register file introduces high power consumption, owing to the high leakage power of SRAM cells. Emerging non-volatile STT-RAM memory has recently been studied as a potential replacement to alleviate the leakage challenge when constructing register files on GPUs. Unfortunately, due to the long write latency and high energy consumption associated with write operations in STT-RAM, simply replacing SRAM with STT-RAM for register files would incur non-trivial performance overhead and bring only marginal energy benefits. In this paper, we propose to optimize STT-RAM based GPU register files for better energy efficiency and performance via two techniques. First, we employ a lightweight compression framework with awareness of register value similarity. It is coupled with a group-based write driver control to mitigate the high energy overhead caused by STT-RAM writes. Second, to address the long write latency overhead of STT-RAM, we propose a centralized SRAM-based write buffer design to efficiently absorb STT-RAM writes with better buffer utilization, rather than the conventional design with distributed per-bank write buffers. The experimental results show that our STT-RAM based register file design consumes only 37.4% energy over the SRAM baseline, while incurring only negligible performance degradation.
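The value similarity that delta compression exploits can be sketched with a generic base-plus-delta encoding of one warp-wide register. This is an illustrative assumption, not the paper's encoding: the 32-lane warp, the 32-bit word, the 8-bit signed deltas, and all function names are chosen here for the example.

```python
from typing import List, Optional, Tuple


def compress_register(
    values: List[int], delta_bits: int = 8
) -> Optional[Tuple[int, List[int]]]:
    """Return (base, deltas for lanes 1..n-1), or None if a delta does not fit."""
    if not values:
        raise ValueError("a warp register needs at least one lane")
    base = values[0]
    lo = -(1 << (delta_bits - 1))
    hi = (1 << (delta_bits - 1)) - 1
    deltas = [v - base for v in values[1:]]
    if all(lo <= d <= hi for d in deltas):
        return base, deltas
    return None  # lanes differ too much; store the register uncompressed


def decompress_register(base: int, deltas: List[int]) -> List[int]:
    """Rebuild the per-lane values from the base and the deltas."""
    return [base] + [base + d for d in deltas]


def compressed_bits(lanes: int, word_bits: int = 32, delta_bits: int = 8) -> int:
    """Bits written when compression succeeds: one full word plus narrow deltas."""
    return word_bits + (lanes - 1) * delta_bits
```

Under these assumptions a register holding consecutive thread indices shrinks from 32 x 32 = 1024 bits to 280 bits, which is the kind of reduction in written bits that would lower write energy in a technology with costly writes.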

Citations: 23

High-level synthesis for micro-electrode-dot-array digital microfluidic biochips

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2898028

Zipeng Li, Kelvin Yi-Tse Lai, Po-Hsien Yu, Tsung-Yi Ho, K. Chakrabarty, Chen-Yi Lee

Abstract: A digital microfluidic biochip (DMFB) is an attractive technology platform for automating laboratory procedures in biochemistry. However, today's DMFBs suffer from several limitations: (i) constraints on droplet size and the inability to vary droplet volume in a fine-grained manner; (ii) the lack of integrated sensors for real-time detection; (iii) the need for special fabrication processes and reliability/yield concerns. To overcome the above problems, DMFBs based on a micro-electrode-dot-array (MEDA) architecture have recently been demonstrated. However, due to the inherent differences between today's DMFBs and MEDA, existing synthesis solutions cannot be utilized for MEDA-based biochips. We present the first biochip synthesis approach that can be used for MEDA. The proposed synthesis method targets operation scheduling, module placement, routing of droplets of various sizes, and diagonal movement of droplets in a two-dimensional array. Simulation results using benchmarks and experimental results using a fabricated MEDA biochip demonstrate the effectiveness of the proposed co-optimization technique.
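The benefit of diagonal droplet movement for routing can be illustrated with a small breadth-first search on a two-dimensional array. This is a simplified sketch, not the synthesis method from the paper: the droplet is modeled as a single cell (droplet size is ignored), `blocked` stands in for cells occupied by modules or other droplets, and diagonal moves between two blocked cells are not forbidden.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]


def route_droplet(
    rows: int, cols: int, start: Cell, goal: Cell, blocked: Set[Cell]
) -> Optional[List[Cell]]:
    """Shortest route from start to goal using 8-neighbour moves, or None."""
    moves = [
        (dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)
    ]
    prev: Dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start.
            path: List[Cell] = []
            cur: Optional[Cell] = cell
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        for dr, dc in moves:
            nxt = (cell[0] + dr, cell[1] + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in blocked and nxt not in prev:
                prev[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable
```

On an empty 5 x 5 array, the route from (0, 0) to (4, 4) takes 4 moves with diagonals, compared with 8 moves if only horizontal and vertical steps were allowed.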

Citations: 53

Spectral graph sparsification in nearly-linear time leveraging efficient spectral perturbation analysis

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) Pub Date: 2016-06-05 DOI: 10.1145/2897937.2898094

Zhuo Feng

Abstract: Spectral graph sparsification aims to find an ultra-sparse subgraph whose Laplacian matrix can well approximate the original Laplacian matrix in terms of its eigenvalues and eigenvectors. The resultant sparsified subgraph can be efficiently leveraged as a proxy in a variety of numerical computation applications and graph-based algorithms. This paper introduces a practically efficient, nearly-linear time spectral graph sparsification algorithm that can immediately lead to the development of nearly-linear time symmetric diagonally-dominant (SDD) matrix solvers. Our spectral graph sparsification algorithm can efficiently build an ultra-sparse subgraph from a spanning tree subgraph by adding a few "spectrally-critical" off-tree edges back to the spanning tree, which is enabled by a novel spectral perturbation approach and approximately preserves key spectral properties of the original graph Laplacian. Extensive experimental results confirm the nearly-linear runtime scalability of an SDD matrix solver for large-scale, real-world problems, such as VLSI, thermal, and finite-element analysis problems. For instance, a sparse SDD matrix with 40 million unknowns and 180 million nonzeros can be solved (1E-3 accuracy level) within two minutes using a single CPU core and about 6GB memory.
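The "spanning tree plus a few off-tree edges" structure described in the abstract can be sketched as follows. The ranking criterion here is a stand-in: off-tree edges are scored by their stretch (edge weight times the resistance of the tree path between their endpoints), not by the paper's spectral perturbation analysis, and the maximum-weight spanning tree and all function names are likewise assumptions made for illustration.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight), with weight > 0


def max_spanning_tree(n: int, edges: List[Edge]) -> Tuple[List[Edge], List[Edge]]:
    """Kruskal on descending weights; returns (tree edges, off-tree edges)."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree: List[Edge] = []
    off: List[Edge] = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
        else:
            off.append((u, v, w))
    return tree, off


def tree_resistance(tree: List[Edge], src: int, dst: int) -> float:
    """Sum of 1/weight along the unique tree path from src to dst."""
    adj: Dict[int, List[Tuple[int, float]]] = defaultdict(list)
    for u, v, w in tree:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))
    dist = {src: 0.0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            return dist[x]
        for y, r in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + r
                queue.append(y)
    raise ValueError("src and dst are not connected by the tree")


def sparsify(n: int, edges: List[Edge], extra: int) -> List[Edge]:
    """Spanning tree plus the `extra` off-tree edges with the largest stretch."""
    tree, off = max_spanning_tree(n, edges)
    ranked = sorted(
        off, key=lambda e: e[2] * tree_resistance(tree, e[0], e[1]), reverse=True
    )
    return tree + ranked[:extra]
```

Recomputing the tree adjacency for every off-tree edge keeps the sketch short but makes it quadratic in the worst case; a nearly-linear implementation would need a far more careful scoring procedure than the one shown.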

Citations: 27