Jianli Chen, Li Yang, Zheng Peng, Wen-xing Zhu, Yao-Wen Chang
{"title":"Novel Proximal Group ADMM for Placement Considering Fogging and Proximity Effects","authors":"Jianli Chen, Li Yang, Zheng Peng, Wen-xing Zhu, Yao-Wen Chang","doi":"10.1145/3240765.3240832","DOIUrl":"https://doi.org/10.1145/3240765.3240832","url":null,"abstract":"Fogging and proximity effects are two major factors that cause inaccurate exposure and thus layout pattern distortions in e-beam lithography. In this paper, we propose the first analytical placement algorithm to consider both the fogging and proximity effects. We first formulate the global placement problem as a separable minimization problem with linear constraints, where different objectives can be tackled one by one in an alternating fashion. Then, we propose a novel proximal group alternating direction method of multipliers (ADMM) to solve the separable minimization problem with two subproblems, where the first subproblem (mainly associated with wirelength and density) is solved by a steepest descent method without line-search, and the second one (mainly associated with the fogging and proximity effects) is handled by an analytical scheme. We prove the property of global convergence of the proximal group ADMM method. Finally, legalization and detailed placement are used to legal and further improve the placement result. Experimental results show that our algorithm is effective and efficient for the addressed problem. Compared with the state-of-the-art work, our algorithm not only can achieve 13.4% smaller fogging variation and 21.4% lower proximity variation, but also has a 1.65× speedup.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125248948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware-accelerated Data Acquisition and Authentication for High-speed Video Streams on Future Heterogeneous Automotive Processing Platforms","authors":"M. Geier, Fabian Franzen, S. Chakraborty","doi":"10.1145/3240765.3243478","DOIUrl":"https://doi.org/10.1145/3240765.3243478","url":null,"abstract":"With the increasing use of Ethernet-based communication backbones in safety-critical real-time domains, both efficient and predictable interfacing and cryptographically secure authentication of high-speed data streams are becoming very important. Although the increasing data rates of in-vehicle networks allow the integration of more demanding (e.g., camera-based) applications, processing speeds and, in particular, memory bandwidths are no longer scaling accordingly. The need for authentication, on the other hand, stems from the ongoing convergence of traditionally separated functional domains and the extended connectivity both in- (e.g., smart-phones) and outside (e.g., telemetry, cloud-based services and vehicle-to-X technologies) current vehicles. The inclusion of cryptographic measures thus requires careful interface design to meet throughput, latency, safety, security and power constraints given by the particular application domain. Over the last decades, this has forced system designers to not only optimize their software stacks accordingly, but also incrementally move interface functionalities from software to hardware. This paper discusses existing and emerging methods for dealing with high-speed data streams ranging from software-only via mixed-hardware/software approaches to fully hardware-based solutions. In particular, we introduce two approaches to acquire and authenticate GigE Vision Video Streams at full line rate of Gigabit Ethernet on Programmable SoCs suitable for future heterogeneous automotive processing platforms.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117352933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy-Preserving Deep Learning and Inference","authors":"M. Riazi, F. Koushanfar","doi":"10.1145/3240765.3274560","DOIUrl":"https://doi.org/10.1145/3240765.3274560","url":null,"abstract":"We provide a systemization of knowledge of the recent progress made in addressing the crucial problem of deep learning on encrypted data. The problem is important due to the prevalence of deep learning models across various applications, and privacy concerns over the exposure of deep learning IP and user's data. Our focus is on provably secure methodologies that rely on cryptographic primitives and not trusted third parties/platforms. Computational intensity of the learning models, together with the complexity of realization of the cryptography algorithms hinder the practical implementation a challenge. We provide a summary of the state-of-the-art, comparison of the existing solutions, as well as future challenges and opportunities.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"383 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124769555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonidas Kosmidis, C. Maxim, Victor Jégu, Francis Vatrinet, F. Cazorla
{"title":"Industrial Experiences with Resource Management under Software Randomization in ARINC653 Avionics Environments","authors":"Leonidas Kosmidis, C. Maxim, Victor Jégu, Francis Vatrinet, F. Cazorla","doi":"10.1145/3240765.3240818","DOIUrl":"https://doi.org/10.1145/3240765.3240818","url":null,"abstract":"Injecting randomization in different layers of the computing platform has been shown beneficial for security, resilience to software bugs and timing analysis. In this paper, with focus on the latter, we show our experience regarding memory and timing resource management when software randomization techniques are applied to one of the most stringent industrial environments, ARINC653-based avionics. We describe the challenges in this task, we propose a set of solutions and present the results obtained for two commercial avionics applications, executed on COTS hardware and RTOS.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"315 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129606382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sebastian Vogel, Mengyu Liang, A. Guntoro, W. Stechele, G. Ascheid
{"title":"Efficient Hardware Acceleration of CNNs using Logarithmic Data Representation with Arbitrary log-base","authors":"Sebastian Vogel, Mengyu Liang, A. Guntoro, W. Stechele, G. Ascheid","doi":"10.1145/3240765.3240803","DOIUrl":"https://doi.org/10.1145/3240765.3240803","url":null,"abstract":"Efficient acceleration of Deep Neural Networks is a manifold task. In order to save memory requirements and reduce energy consumption we propose the use of dedicated accelerators with novel arithmetic processing elements which use bit shifts instead of multipliers. While a regular power-of-2 quantization scheme allows for multiplierless computation of multiply-accumulate-operations, it suffers from high accuracy losses in neural networks. Therefore, we evaluate the use of powers-of-arbitrary-log-bases and confirmed their suitability for quantization of pre-trained neural networks. The presented method works without retraining of the neural network and therefore is suitable for applications in which no labeled training data is available. In order to verify our proposed method, we implement the log-based processing elements into a neural network accelerator on an FPGA. The hardware efficiency is evaluated in terms of FPGA utilization and energy requirements in comparison to regular 8-bit-fixed-point multiplier based acceleration. Using this approach hardware resources are minimized and power consumption is reduced by 22.3%.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127883035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logic Synthesis of Binarized Neural Networks for Efficient Circuit Implementation","authors":"Chia-Chih Chi, J. H. Jiang","doi":"10.1145/3240765.3240822","DOIUrl":"https://doi.org/10.1145/3240765.3240822","url":null,"abstract":"Neural networks (NNs) are key to deep learning systems. Their efficient hardware implementation is crucial to applications at the edge. Binarized NNs (BNNs), where the weights and output of a neuron are of binary values {–1, +1} (or encoded in {0, 1}), have been proposed recently. As no multiplier is required, they are particularly attractive and suitable for hardware realization. Most prior NN synthesis methods target on hardware architectures with neural processing elements (NPEs), where the weights of a neuron are loaded and the output of the neuron is computed. The load-and-compute method, though area efficient, requires expensive memory access, which deteriorates energy and performance efficiency. In this work we aim at synthesizing BNN dense layers into dedicated logic circuits. We formulate the corresponding matrix covering problem and propose a scalable algorithm to reduce the area and routing cost of BNNs. Experimental results justify the effectiveness of the method in terms of area and net savings on FPGA implementation. Our method provides an alternative implementation of BNNs, and can be applied in combination with NPE-based implementation for area, speed, and power tradeoffs.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127914972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yun Long, Taesik Na, Prakshi Rastogi, Karthik Rao, A. Khan, S. Yalamanchili, S. Mukhopadhyay
{"title":"A Ferroelectric FET based Power-efficient Architecture for Data-intensive Computing","authors":"Yun Long, Taesik Na, Prakshi Rastogi, Karthik Rao, A. Khan, S. Yalamanchili, S. Mukhopadhyay","doi":"10.1145/3240765.3240770","DOIUrl":"https://doi.org/10.1145/3240765.3240770","url":null,"abstract":"In this paper, we present a ferroelectric FET (FeFET) based power-efficient architecture to accelerate data-intensive applications such as deep neural networks (DNNs). We propose a cross-cutting solution combining emerging device technologies, circuit optimizations, and micro-architecture innovations. At device level, FeFET crossbar is utilized to perform vector-matrix multiplication (VMM). As a field effect device, FeFET significantly reduces the read/write energy compared with the resistive random-access memory (ReRAM). At circuit level, we propose an all-digital peripheral design, reducing the large overhead introduced by ADC and DAC in prior works. In terms of micro-architecture innovation, a dedicated hierarchical network-on-chip (H-NoC) is developed for input broadcasting and on-the-fly partial results processing, reducing the data transmission volume and latency. Speed, power, area and computing accuracy are evaluated based on detailed device characterization and system modeling. For DNN computing, our design achieves 254x and 9.7x gain in power efficiency (GOPS/W) compared to GPU and ReRAM based designs, respectively.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116681294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multithreaded Initial Detailed Routing Algorithm Considering Global Routing Guides","authors":"Fan-Keng Sun, Hao Chen, Ching-Yu Chen, Chen-Hao Hsu, Yao-Wen Chang","doi":"10.1145/3240765.3240777","DOIUrl":"https://doi.org/10.1145/3240765.3240777","url":null,"abstract":"Detailed routing is the most complicated and time-consuming stage in VLSI design and has become a critical process for advanced node enablement. To handle the high complexity of modern detailed routing, initial detailed routing is often employed to minimize design-rule violations to facilitate final detailed routing, even though it is still not violation-free after initial routing. This paper presents a novel initial detailed routing algorithm to consider industrial design-rule constraints and optimize the total wirelength and via count. Our algorithm consists of three major stages: (1) an effective pin-access point generation method to identify valid points to model a complex pin shape, (2) a via-aware track assignment method to minimize the overlaps between assigned wire segments, and (3) a detailed routing algorithm with a novel negotiation-based rip-up and re-route scheme that enables multithreading and honors global routing information while minimizing design-rule violations. Experimental results show that our router outperforms all the winning teams of the 2018 ACM ISPD Initial Detailed Routing Contest, where the top-3 routers result in 23%, 52%, and 1224% higher costs than ours.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117208158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Architecting Data Placement in SSDs for Efficient Secure Deletion Implementation","authors":"Hoda Aghaei Khouzani, Chen Liu, Chengmo Yang","doi":"10.1145/3240765.3240780","DOIUrl":"https://doi.org/10.1145/3240765.3240780","url":null,"abstract":"Secure deletion ensures user privacy by permanently removing invalid data from the secondary storage. This process is particularly critical to solid state drives (SSDs) wherein invalid data are generated not only upon deleting a file but also upon updating a file of which the user is not aware. While previous secure deletion schemes are usually applied to all invalid data on the SSD, our observation is that in many cases security is not required for all files on the SSD. This paper proposes an efficient secure deletion scheme targeting only the invalid data of files marked as “secure” by the user. A security-aware data allocation strategy is designed, which separates secure and unsecure data at lower (block) level but mixes them at higher levels of SSD hierarchical organization. Block-level separation minimizes secure deletion cost, while higher-level mixing mitigates the adverse impact of secure deletion on SSD lifetime. A two-level block management scheme is further developed to scatter secure blocks over the SSD for wear leveling. Experiments on real-world benchmarks confirm the advantage of the proposed scheme in reducing secure deletion cost and improving SSD lifetime.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121261346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending ML-OARSMT to Net Open Locator with Efficient and Effective Boolean Operations","authors":"B. Jiang, Hung-Ming Chen","doi":"10.1145/3240765.3240807","DOIUrl":"https://doi.org/10.1145/3240765.3240807","url":null,"abstract":"Multi-layer obstacle-avoiding rectilinear Steiner minimal tree (ML-OARSMT) problem has been extensively studied in recent years. In this work, we consider a variant of ML-OARSMT problem and extend the applicability to the net open location finder. Since ECO or router limitations may cause the open nets, we come up with a framework to detect and reconnect existing nets to resolve the net opens. Different from prior connection graph based approach, we propose a technique by applying efficient Boolean operations to repair net opens. Our method has good quality and scalability and is highly parallelizable. Compared with the results of ICCAD-2017 contest, we show that our proposed algorithm can achieve the smallest cost with 4.81 speedup in average than the top-3 winners.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127384293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}