{"title":"PAIS: Parallelization aware instruction scheduling for improving soft-error reliability of GPU-based systems","authors":"Haeseung Lee, Hsinchung Chen, M. A. Faruque","doi":"10.3850/9783981537079_0869","DOIUrl":"https://doi.org/10.3850/9783981537079_0869","url":null,"abstract":"For decades the semiconductor industry has been driven by Moore's Law and performed aggressive technology scaling to achieve low-power and high-performance. Meanwhile, the semiconductor industry has faced severe reliability challenges like soft-error. Many methodologies (such as redundancy methodologies) have been proposed to improve the soft-error reliability of GPU based systems. However, the GPU compiler has yet to be considered for improving the soft-error reliability of the GPU. In this paper, we propose a novel GPU architecture-aware compilation methodology to further improve the soft-error reliability. The proposed methodology jointly considers the parallel behavior of the GPU and the applications and minimizes the vulnerability of the GPU applications during instruction scheduling. The experimental results show that our methodology is able to perform the scheduling within 5.88 seconds on average and achieves soft-error reliability improvement up to 40% compared to the state-of-the-art compilation techniques. The results show that the performance and power overheads of our methodology are less than 10% in most of the cases.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121787686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Security in industrie 4.0 - challenges and solutions for the fourth industrial revolution","authors":"M. Waidner, M. Kasper","doi":"10.3850/9783981537079_1005","DOIUrl":"https://doi.org/10.3850/9783981537079_1005","url":null,"abstract":"Information technology (IT) is one of the most important drivers of innovation in production and automation. In Germany, the term Industrie 4.0 summarizes various activities and developments involved in the evolution of industrial processes in production, logistics, automation, etc. Many research and development projects work on different aspects of these developments. In the view of politics, industry, and IT enterprises, sufficient IT security is considered an essential prerequisite for the future of production. Although many current IT security solutions can be applied in Industrie 4.0 context, they do not satisfy requirements of processes in Industrie 4.0. Work needs to be done on underlying security mechanisms as well as on security architectures.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121871808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Precision timed industrial automation systems","authors":"Matthew M. Y. Kuo, Sidharta Andalam, P. Roop","doi":"10.3850/9783981537079_0186","DOIUrl":"https://doi.org/10.3850/9783981537079_0186","url":null,"abstract":"For Programmable Logic Controllers (PLCs) that implement safety-critical industrial automation systems, timing correctness is as important as its functional correctness. Modern PLCs employ run-time environments and/or general purpose processors designed by ARM, Intel and Freescale to implement real-time systems. However, general purpose processors are designed to improve the average case performance and ignore the worst case performance. This makes it nearly impossible to guarantee the timing correctness of safety-critical applications. In this paper, we apply the recently developed PRET philosophy to propose Precision Timed Industrial Automation (PTIA) Systems for the design of precision timed industrial automation systems.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122457394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variation-aware near threshold circuit synthesis","authors":"M. Golanbari, S. Kiamehr, Mojtaba Ebrahimi, M. Tahoori","doi":"10.3850/9783981537079_0478","DOIUrl":"https://doi.org/10.3850/9783981537079_0478","url":null,"abstract":"Near-Threshold Computing (NTC) is shown to be a promising approach for improving the energy efficiency of VLSI circuits. Nevertheless, by reducing the supply voltage the delay impact of process variation significantly increases, leading to up to 20× performance variation compared to the nominal voltage. As a result, it is wasteful of energy and performance to deal with such variation by increasing the timing margins, which is common in nominal voltage. Therefore, considering the impact of process variation during the near-threshold circuit design phase is of decisive importance. In this paper, we propose a variation-aware synthesis flow for NTC to address this problem. The objective is to improve the performance and energy efficiency of a circuit during design time by considering statistical variation information. This is done by providing variation information to the synthesis tool, evaluating the performance of the synthesized circuit by Statistical Static Timing Analysis (SSTA), and adjusting the timing constraints accordingly in an iterative manner. Simulation results for a set of benchmark circuits show that our proposed flow reduces the variation by 86.6% and improves the performance and energy by 24.9% and 7.4%, respectively, at the expense of 4.8% area overhead.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"244 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122720986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthesis of approximate coders for on-chip interconnects using reversible logic","authors":"R. Wille, Oliver Keszöcze, S. Hillmich, Marcel Walter, A. Ortiz","doi":"10.3850/9783981537079_0287","DOIUrl":"https://doi.org/10.3850/9783981537079_0287","url":null,"abstract":"On-chip coding provides a remarkable potential to improve the energy efficiency of on-chip interconnects. However, the logic design of the encoder/decoder faces a main challenge: the area and power overhead should be minimal while, at the same time, decodability has to be guaranteed. To address these problems, we propose the concept of approximate coding, where the coding function is partially specified and the synthesis algorithm has a higher flexibility to simplify the circuit. Since conventional synthesis methods are unsuitable here, we propose an alternative synthesis approach based on reversible logic. Experimental evaluations confirm the benefits of both, the proposed concept of approximate codings as well as the proposed design method.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122815981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving scalability of CMPs with dense ACCs coverage","authors":"N. Teimouri, H. Tabkhi, G. Schirner","doi":"10.3850/9783981537079_0527","DOIUrl":"https://doi.org/10.3850/9783981537079_0527","url":null,"abstract":"Utilizing Hardware Accelerators (ACCs) is a promising solution to improve performance and power efficiency of Chip Multi-Processors (CMPs). However, new challenges arise with the trend of shifting from few ACCs (with sparse ACCs coverage) to many ACCs (denser ACCs coverage) on a chip. The primary challenges are a lack of clear semantics in ACC communication as well as a processor-centric view for orchestrating the entire system. This paper opens a path toward efficient integration of many ACCs on a single chip. To this end, the paper at first identifies 4 major semantic aspects when two ACCs communicate with each other: data access model, data granularity, marshalling, and synchronization. Based on the identified semantics, the paper then proposes an efficient architecture solution, Transparent Self-Synchronizing (TSS), to realize the identified semantics in the underlying architecture. In principle, TSS proposes a shift from the current processor-centric view to a more equal, peer view between ACCs and the host processors. TSS minimizes the interaction with the host processor and reduces the volume of ACC-to-ACC communication traffic exposed to the system fabric. Our results using 8 streaming applications with a varying ACC coverage density demonstrate significant benefits of TSS, including a 3x speedup over the current ACC-based architectures.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130014145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A discrete thermal controller for chip-multiprocessors","authors":"Yingnan Cui, Wei Zhang, Bingsheng He","doi":"10.3850/9783981537079_0656","DOIUrl":"https://doi.org/10.3850/9783981537079_0656","url":null,"abstract":"As the power density of modern processors keeps increasing, thermal management remains a challenging problem for processor designers. Among various solutions, closed-loop automatic thermal controllers have the benefits of fast response speed and high control accuracy. However, as a processor is a discrete system by nature, controllers designed by classic control theories fail to consider the system features related to the discreteness and thus cannot achieve optimal result. In this work, we propose a discrete thermal controller with the form of the digital filter with special concern about the frequency field response affected by the sampling process. We optimize the sampling period and the response time of the controller. Experimental results show up to 50% sampling frequency reduction and up to 25% improvement in the performance of CMP systems with thermal constraints when compared to other state-of-the-art closed-loop thermal controllers.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129320488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing the impact of injected sensor data on an Advanced Driver Assistance System using the OP2TIMUS prototyping platform","authors":"Alexander Stühring, Günter Ehmen, Sibylle B. Fröschle","doi":"10.3850/9783981537079_0361","DOIUrl":"https://doi.org/10.3850/9783981537079_0361","url":null,"abstract":"Modern vehicles are running complex and safety critical applications distributed over several Electronic Control Units (ECUs). Some ECUs are equipped with communication interfaces providing access to other devices, networks or remote services. Since the number of attack vectors is increasing, an early investigation of the impact of attacks becomes steadily more important. This paper gives an example how manipulated sensor data injected to the CAN bus affects an Advanced Driver Assistance System (ADAS). Within multiple experiments we illustrate the impact of different aspects like the sending rate.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129644247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fading memory effects in a memristor for Cellular Nanoscale Network applications","authors":"A. Ascoli, R. Tetzlaff, L. Chua, J. Strachan, R. S. Williams","doi":"10.3850/9783981537079_0977","DOIUrl":"https://doi.org/10.3850/9783981537079_0977","url":null,"abstract":"CNN based analogic cellular computing is a unified paradigm for universal spatio-temporal computation with several applications in a large number of different fields of research. By endowing CNN with local memory, control, and communication circuitry, many different hardware architectures with stored programmability, showing an enormous computing power - trillion of operations per second may be executed on a single chip -, have been realized. The complex spatio-temporal dynamics emerging in certain CNN may lead to the development of more efficient information processing methods as compared to conventional strategies. Memristors exhibit a rich variety of nonlinear behaviours, occupy a negligible amount of integrated circuit area, consume very little power, are suited to a massively-parallel data flow, and may combine data storage with signal processing. As a result, the use of memristors in future CNN-based computing structures may improve and/or extend the functionalities of state-of-the art hardware architectures. This contribution provides a detailed analysis of the system-theoretic model of a tantalum oxide memristor, in view of its potential adoption for the implementation of synaptic operators in CNN architectures.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"208-209 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128548061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning-based dynamic reliability management for dark silicon processor considering EM effects","authors":"Taeyoung Kim, Xin Huang, Hai-Bao Chen, V. Sukharev, S. Tan","doi":"10.3850/9783981537079_0441","DOIUrl":"https://doi.org/10.3850/9783981537079_0441","url":null,"abstract":"In this article, we propose a new dynamic reliability management (DRM) technique for emerging dark silicon manycore processors. We formulate our DRM problem as minimizing the energy consumption subject to the reliability, performance and thermal constraints. The new approach is based on a newly proposed physics-based electromigration (EM) reliability model to predict the EM reliability of full-chip power grid networks. We consider thermal design power (TDP) as the power constraint for a dark silicon manycore processor. We employ both dynamic voltage and frequency scaling (DVFS) and dark silicon core using ON/OFF pulsing action as the two control knobs. To solve the problem, we apply the adaptive Q-learning based method, which is suitable for runtime operation as it can provide cost-effective yet good solutions. A large class of multithreaded applications is used as the benchmark to validate and compare the proposed dynamic reliability management methods. Experimental results on a 64-core dark silicon chip show that the proposed DRM algorithm can effectively reduce the energy consumption of a dark silicon manycore system when the system is not tightly constrained. The proposed method can outperform a simple global DVFS method significantly in this case.","PeriodicalId":311352,"journal":{"name":"2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122396162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}