Sayandeep Sanyal, Aritra Hazra, P. Dasgupta, Scott Morrison, S. Surendran, Lakshmanan Balasubramanian
{"title":"The Notion of Cross Coverage in AMS Design Verification","authors":"Sayandeep Sanyal, Aritra Hazra, P. Dasgupta, Scott Morrison, S. Surendran, Lakshmanan Balasubramanian","doi":"10.1109/ASP-DAC47756.2020.9045131","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045131","url":null,"abstract":"Coverage monitoring is fundamental to design verification. Coverage artifacts are well developed for digital integrated circuits and these aim to cover the discrete state space and logical behaviors of the design. Analog designers are similarly concerned with the operating regions of the design and its response to an infinite and dense input space. Analog variables can influence each other in far more complex ways as compared to digital variables, consequently, the notion of cross coverage, as introduced in the analog context for the first time in this paper, is of high importance in analog design verification. This paper presents the formal syntax and semantics of analog cross coverage artifacts, the methods for evaluating them using our tool kit, and most importantly, the insights that can be gained from such cross coverage analysis.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129880571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mian Qin, Joo Hwan Lee, Rekha Pitchumani, Y. Ki, N. Reddy, Paul V. Gratz
{"title":"A Generic FPGA Accelerator for Minimum Storage Regenerating Codes","authors":"Mian Qin, Joo Hwan Lee, Rekha Pitchumani, Y. Ki, N. Reddy, Paul V. Gratz","doi":"10.1109/ASP-DAC47756.2020.9045125","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045125","url":null,"abstract":"Erasure coding is widely used in storage systems to achieve fault tolerance while minimizing the storage overhead. Recently, Minimum Storage Regenerating (MSR) codes are emerging to minimize repair bandwidth while maintaining the storage efficiency. Traditionally, erasure coding is implemented in the storage software stacks, which hinders normal operations and blocks resources that could be serving other user needs due to poor cache performance and costs high CPU and memory utilizations. In this paper, we propose a generic FPGA accelerator for MSR codes encoding/decoding which maximizes the computation parallelism and minimizes the data movement between off-chip DRAM and the on-chip SRAM buffers. To demonstrate the efficiency of our proposed accelerator, we implemented the encoding/decoding algorithms for a specific MSR code called Zigzag code on Xilinx VCU1525 acceleration card. Our evaluation shows our proposed accelerator can achieve ~2.4-3.1x better throughput and ~4.2-5.7x better power efficiency compared to the state-of-art multi-core CPU implementation and ~2.8-3.3x better throughput and ~4.2-5.3x better power efficiency compared to a modern GPU accelerator.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130388981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reika Kinoshita, C. Matsui, Atsuya Suzuki, S. Fukuyama, K. Takeuchi
{"title":"Workload-aware Data-eviction Self-adjusting System of Multi-SCM Storage to Resolve Trade-off between SCM Data-retention Error and Storage System Performance","authors":"Reika Kinoshita, C. Matsui, Atsuya Suzuki, S. Fukuyama, K. Takeuchi","doi":"10.1109/ASP-DAC47756.2020.9045469","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045469","url":null,"abstract":"Storage Class Memories (SCMs) are used as non-volatile (NV) cache memory as well as storage. Multi-SCM storage with two types of SCMs, M-SCM (fast but small capacity memory-type SCM) and S-SCM (slow but large capacity storage-type SCM), has been proposed. In Multi-SCM storage, M-SCM works as NV-cache of S-SCM based storage. M-SCM such as MRAM is fast but may suffer from thermal instabilities and cause data-retention errors at high temperature. Therefore, data in M-SCM should be evicted to S-SCM at short interval before exceeding acceptable data-retention time. However, in case of short interval eviction, frequent data eviction from M-SCM to S-SCM severely degrades the storage system performance. To resolve this trade-off between data-retention reliability and the storage system performance, this paper proposes workload-aware data-eviction self-adjusting system. Proposed system is composed of Access Frequency Monitor (Proposal 1) and Evict Interval Adjustment (Proposal 2). Proposal 1 observes the access frequency of evicted data that directly affects data-retention time of M-SCM. By referring to the results of Proposal 1, Proposal 2 automatically changes the data-eviction interval so that long retention data are moved immediately to S-SCM and the storage system performance can be improved. As a result, maximum data-retention time of M-SCM decreases by 83%, and the storage system performance increases by 5.9 times. Moreover, the acceptable endurance increases by 103 times. Finally, measured data-retention errors and memory cell area decrease by 79% and 5.7%, respectively.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124301686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Grace Li Zhang, M. Brunner, Bing Li, G. Sigl, Ulf Schlichtmann
{"title":"Timing Resilience for Efficient and Secure Circuits","authors":"Grace Li Zhang, M. Brunner, Bing Li, G. Sigl, Ulf Schlichtmann","doi":"10.1109/ASP-DAC47756.2020.9045352","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045352","url":null,"abstract":"In this paper, we will cover several techniques that can enhance the resilience of timing of digital circuits. Using post-silicon tuning components, the clock arrival times at flip-flops can be modified after manufacturing to balance delays between flip-flops. The actual delay properties of flip-flops will be examined to exploit the natural flexibility of such components. Wave-pipelining paths spanning several flip-flop stages can be integrated into a synchronous design to improve the circuit performance and to reduce area. In addition, with this technique, it cannot be taken for granted anymore that all the combinational paths in a circuit work with respect to one clock period. Therefore, a netlist alone does not represent all the design information. This feature enables the potential to embed wave-pipelining paths into a circuit to increase the complexity of reverse engineering. In order to replicate a design, attackers therefore have to identify the locations of the wave-pipelining paths, in addition to the netlist extracted from reverse engineering. Therefore, the security of the circuit against counterfeiting can be improved.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124751259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Graphical Models with Bayesian Learning and MCMC for Failure Diagnosis","authors":"Hongfei Wang, Wenjie Cai, Jianwen Li, Kun He","doi":"10.1109/ASP-DAC47756.2020.9045154","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045154","url":null,"abstract":"Graphical models are powerful machine learning techniques for data analytics. Being capable of statistical reasoning and probabilistic inference, graphical models have the advantages of encoding prior knowledges into the learning procedure, and producing explainable models that can be understood and effectively tuned. In this work, we describe our exploration on the frontier of using graphical models for improving circuit diagnosis results. A statistical framework has been proposed for this aim, which builds Bayesian inference models using directed chain graphs, and structural learning models using undirected tree graphs. As a generative model, the framework integrates Markov chain Monte Carlo (MCMC) algorithm for sampling to evaluate the quality of diagnostic results. It exploits maximum-likelihood to estimate the underlying defect types, which can be informative towards the possible follow-up failure analysis. Five circuit examples demonstrate that the proposed framework achieves the same or better results over a state-of-the-art work. Moreover, our method also shows opportunities for dealing with missing features and locating root causes.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"547 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123374975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yen-Ting Chen, Ming-Chang Yang, Yuan-Hao Chang, W. Shih
{"title":"Parallel-Log-Single-Compaction-Tree: Flash-Friendly Two-Level Key-Value Management in KVSSDs","authors":"Yen-Ting Chen, Ming-Chang Yang, Yuan-Hao Chang, W. Shih","doi":"10.1109/ASP-DAC47756.2020.9045144","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045144","url":null,"abstract":"Log-Structured Merge-Tree (LSM-tree) based key-value store applications have gained popularity due to their high write performance. To further pursue better performance for key-value applications, various researches were conducted by adopting different architectures of flash devices, such as key-value solid-state drives (KVSSDs). However, since LSM-trees were originally designed based on the architecture of hard disk drives (HDDs), true potential of SSDs can not be well exploited without re-designing the management strategy. In this work, we propose Parallel-Log-Single-Compaction-Tree (PLSC-tree), which is a two-level and flash-friendly key-value management strategy specially tailored for KVSSDs. In particular, the first layer takes advantage of the massive internal parallelism of SSDs for maximizing the write performance via logging, while the second layer is designed to alleviate the internal recycling (i.e., compaction) overheads of flash devices for ultimately optimizing the performance on managing key-value pairs. A series of experiments were conducted based on a well-known SSD simulator with realistic workloads, and the results are very encouraging.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"200 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123020267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jeongwoo Heo, Kwangok Jeong, Taewhan Kim, Kyumyung Choi
{"title":"Synthesis of Hardware Performance Monitoring and Prediction Flow Adapting to Near-Threshold Computing and Advanced Process Nodes","authors":"Jeongwoo Heo, Kwangok Jeong, Taewhan Kim, Kyumyung Choi","doi":"10.1109/ASP-DAC47756.2020.9045392","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045392","url":null,"abstract":"An elaborate hardware performance monitor (HPM) has become increasingly important for handling huge performance variation of near-threshold computing and recent process technologies. In this paper, we propose a new approach to the problem of predicting critical path delays (CPDs) using HPM. Precisely, for a target circuit or system, we formulate the problem of finding an efficient combination of ring oscillators (ROs) for accurate prediction of CPDs on the circuit as a mixed integer second-order cone programming and propose a method of minimizing the total number of ROs for a given pessimism level of prediction. Then, we propose a prediction flow of CPDs through statistical estimation of process parameters from measurements of the customized HPM and machine learning based delay mapping from the estimation. For a set of benchmark circuits tested using 28nm PDK and 0.6V operation, it is shown that our approach is very effective, reducing the pessimism of CPDs and minimum supply voltages by 6.7$\\sim$52.9% and 20.6$\\sim$50.8% over those of conventional approaches, respectively.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"27 50","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132709253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kento Hasegawa, Ryota Ishikawa, M. Nishizawa, Kazushi Kawamura, Masashi Tawada, N. Togawa
{"title":"FPGA-based Heterogeneous Solver for Three-Dimensional Routing","authors":"Kento Hasegawa, Ryota Ishikawa, M. Nishizawa, Kazushi Kawamura, Masashi Tawada, N. Togawa","doi":"10.1109/ASP-DAC47756.2020.9045660","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045660","url":null,"abstract":"A heuristic algorithm is one of the approaches to solve an NP-hard problem. In order to enhance the capability of the system, heterogeneous computing is often adapted. In this paper, we propose an FPGA-based heterogeneous solver for three-dimensional routing. The proposed system is implemented into multiple FPGA boards and a single-board computer. The experimental results demonstrate that the proposed system outperforms a single FPGA system.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":" 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134480093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Copyright notice]","authors":"","doi":"10.1109/asp-dac47756.2020.9045377","DOIUrl":"https://doi.org/10.1109/asp-dac47756.2020.9045377","url":null,"abstract":"","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115492079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LeAp: Leading-one Detection-based Softcore Approximate Multipliers with Tunable Accuracy","authors":"Zahra Ebrahimi, Salim Ullah, Akash Kumar","doi":"10.1109/ASP-DAC47756.2020.9045171","DOIUrl":"https://doi.org/10.1109/ASP-DAC47756.2020.9045171","url":null,"abstract":"Approximate multipliers are ubiquitously used in diverse applications by exploiting circuit simplification, mainly specialized for Application-Specific Integrated Circuit (ASIC) platforms. However, the intrinsic architectural specifications of Field-Programmable Gate Arrays (FPGAs) prohibited comparable resource gains when directly applying these techniques. LeAp is an area-, throughput-, and energy-efficient approximate multiplier for FPGAs which efficiently utilizes 6-input Look-up Tables (6-LUTs) and fast carry chains in its novel approximate log calculator to implement Mitchell’s algorithm. Moreover, three novel error-refinement schemes with negligible area overhead and independent from multiplier-size, have boosted accuracy to $>99$%. Experimental results obtained from Vivado, Artificial Neural Network (ANN) and image processing applications indicate superiority of proposed multiplier over accurate and state-of-the-art approximate counterparts. In particular, LeAp outperforms the 32x32 accurate multiplier by achieving 69.7%, 14.7%, 42.1%, and 37.1% improvement in area, throughput, power, and energy, respectively. The library of RTL and behavioral implementations will be open-sourced at https://cfaed.tu-dresden.de/pd-downloads.","PeriodicalId":125112,"journal":{"name":"2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129718656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}