Leonidas Kosmidis, J. Lachaize, J. Abella, O. Notebaert, F. Cazorla, D. Steenari
{"title":"GPU4S: Embedded GPUs in Space","authors":"Leonidas Kosmidis, J. Lachaize, J. Abella, O. Notebaert, F. Cazorla, D. Steenari","doi":"10.1109/DSD.2019.00064","DOIUrl":"https://doi.org/10.1109/DSD.2019.00064","url":null,"abstract":"Following the same trend as automotive and avionics, the space domain is witnessing an increase in on-board computing performance demands. This rise in performance needs comes from both the control and payload parts of the spacecraft and calls for advanced electronics able to provide high computational power under the constraints of the harsh space environment. On the non-technical side, for strategic reasons it is mandatory to achieve European independence in the computing technology used. In this project, which is still in its early phases, we study the applicability of embedded GPUs in space, which have shown a dramatic improvement of their performance-per-watt ratio coming from their proliferation in consumer markets based on competitive European technology. To that end, we perform an analysis of the existing space application domains to identify which software domains can benefit from their use. Moreover, we survey the embedded GPU domain in order to assess whether embedded GPUs can provide the required computational power and identify the challenges which need to be addressed for their adoption in space. 
In this paper, we describe the steps to be followed in the project, as well as the results of our preliminary analyses in the first months of the project.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131362747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Druml, O. Veledar, Georg Macher, G. Stettinger, Selim Solmaz, Jakob Reckenzaun, S. Diaz, M. Marcano, J. Villagrá, R. Beekelaar, Johannes Jany-Luig, Marta Maria Corredoira, P. Burgio, Christian Ballato, B. Debaillie, Lars van Meurs, A. Terechko, F. Tango, A. Ryabokon, A. Anghel, O. Icoglu, Sumeet S. Kumar, G. Dimitrakopoulos
{"title":"PRYSTINE - Technical Progress After Year 1","authors":"N. Druml, O. Veledar, Georg Macher, G. Stettinger, Selim Solmaz, Jakob Reckenzaun, S. Diaz, M. Marcano, J. Villagrá, R. Beekelaar, Johannes Jany-Luig, Marta Maria Corredoira, P. Burgio, Christian Ballato, B. Debaillie, Lars van Meurs, A. Terechko, F. Tango, A. Ryabokon, A. Anghel, O. Icoglu, Sumeet S. Kumar, G. Dimitrakopoulos","doi":"10.1109/DSD.2019.00063","DOIUrl":"https://doi.org/10.1109/DSD.2019.00063","url":null,"abstract":"Among the current trends that will affect society in the coming years, autonomous driving stands out as having the potential to disruptively change the automotive industry as we know it today. For this, fail-operational behavior is essential in the sense, plan, and act stages of the automation chain in order to handle safety-critical situations on its own, which state-of-the-art approaches currently do not achieve, partly because reliable environment perception and sensor fusion are missing. PRYSTINE will realize Fail-operational Urban Surround perceptION (FUSION), which is based on robust Radar and LiDAR sensor fusion and control functions, in order to enable safe automated driving in urban and rural environments. In this paper, we detail the vision of the PRYSTINE project and we showcase the results achieved during the first year.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133808983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benjamin Dauphin, R. Pacalet, Andrea Enrici, L. Apvrille
{"title":"Odyn: Deadlock Prevention and Hybrid Scheduling Algorithm for Real-Time Dataflow Applications","authors":"Benjamin Dauphin, R. Pacalet, Andrea Enrici, L. Apvrille","doi":"10.1109/DSD.2019.00023","DOIUrl":"https://doi.org/10.1109/DSD.2019.00023","url":null,"abstract":"In recent wireless communication standards (4G, 5G), the growing need for dynamic adjustments of transmission parameters (e.g., modulation, bandwidth, channel coding rate) makes traditional static scheduling approaches less and less efficient, because precomputed fixed mapping and scheduling prevent the system from dynamically adapting to changes of the operating conditions (e.g., wireless channel quality, available bandwidth). In this paper, we present Odyn, a hybrid approach for the scheduling and memory management of periodic dataflow applications on parallel, heterogeneous, Non-Uniform Memory Architecture (NUMA) platforms. In Odyn, the ordering of tasks and memory allocation are distributed and computed simultaneously at run-time for each Processing Element. Odyn also proposes a mechanism to prevent deadlocks caused by attempts to allocate buffers in size-limited memories. This technique, based on the static computation of exclusion relations among buffers in a target application, removes the need for backtracking that is typical of dynamic scheduling algorithms. We demonstrate the effectiveness of Odyn on a testbench that simulates the interactions of randomly generated concurrent applications. 
We also demonstrate its deadlock prevention technique on a selection of use cases.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"96 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133984990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of a Fast and Scalable NTT-Based Polynomial Multiplier Architecture","authors":"A. Mert, Erdinç Öztürk, E. Savaş","doi":"10.1109/DSD.2019.00045","DOIUrl":"https://doi.org/10.1109/DSD.2019.00045","url":null,"abstract":"In this paper, we present an optimized FPGA implementation of a novel, fast and highly parallelized NTT-based polynomial multiplier architecture, which proves to be effective as an accelerator for lattice-based homomorphic cryptographic schemes. As I/O operations are as time-consuming as NTT operations during homomorphic computations in a host processor/accelerator setting, instead of achieving the fastest NTT implementation possible on the target FPGA, we focus on a balanced time performance between the NTT and I/O operations. Even with this goal, we achieved the fastest NTT implementation in the literature, to the best of our knowledge. For proof of concept, we utilize our architecture in a framework for the Fan-Vercauteren (FV) homomorphic encryption scheme, utilizing a hardware/software co-design approach, in which polynomial multiplication operations are offloaded to the accelerator via PCIe bus while the rest of the operations in the FV scheme are executed in software running on an off-the-shelf desktop computer. Specifically, our framework is optimized to accelerate the Simple Encrypted Arithmetic Library (SEAL), developed by the Cryptography Research Group at Microsoft Research, for the FV encryption scheme, where large degree polynomial multiplications are utilized extensively. The hardware part of the proposed framework targets a Xilinx Virtex-7 FPGA device and the proposed framework achieves almost 11x latency speedup for the offloaded operations compared to their pure software implementations. 
We achieved a throughput of almost 800K polynomial multiplications per second, for polynomials of degree 1024 with 32-bit coefficients.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"6 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131901206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
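To make the core operation of the preceding entry concrete, the following is a minimal Python sketch of NTT-based polynomial multiplication in the ring Z_q[x]/(x^n + 1), the negacyclic setting commonly used by lattice-based schemes. It is not the architecture described in the abstract: the function names, the toy parameters (n = 4, q = 17, psi = 2) and the naive O(n^2) transform are assumptions chosen purely for readability, whereas a real implementation would use a fast butterfly-based transform and much larger parameters.

```python
def ntt(values, root, q):
    """Naive O(n^2) number-theoretic transform of `values` modulo the prime q."""
    n = len(values)
    return [sum(v * pow(root, i * j, q) for j, v in enumerate(values)) % q
            for i in range(n)]


def negacyclic_mul_ntt(a, b, psi, q):
    """Multiply a and b modulo (x^n + 1, q) via the NTT.

    psi is assumed to be a primitive 2n-th root of unity modulo the prime q,
    so psi**n == -1 (mod q) and omega = psi**2 is a primitive n-th root.
    """
    n = len(a)
    omega = psi * psi % q
    # Modular inverses via Fermat's little theorem (q is assumed prime).
    psi_inv = pow(psi, q - 2, q)
    omega_inv = pow(omega, q - 2, q)
    n_inv = pow(n, q - 2, q)
    # Pre-twist by powers of psi: turns negacyclic into cyclic convolution.
    a_tw = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_tw = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    # Pointwise product in the transform domain.
    prod = [x * y % q for x, y in zip(ntt(a_tw, omega, q), ntt(b_tw, omega, q))]
    # Inverse transform, scaling by n^-1, then undo the twist.
    c_tw = [x * n_inv % q for x in ntt(prod, omega_inv, q)]
    return [x * pow(psi_inv, i, q) % q for i, x in enumerate(c_tw)]


def negacyclic_mul_schoolbook(a, b, q):
    """Reference O(n^2) multiplication modulo (x^n + 1, q)."""
    n = len(a)
    c = [0] * n
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            k = i + j
            if k < n:
                c[k] = (c[k] + x * y) % q
            else:
                # x^n wraps around to -1 in this ring.
                c[k - n] = (c[k - n] - x * y) % q
    return c
```

For example, `negacyclic_mul_ntt([1, 2, 3, 4], [5, 6, 7, 8], 2, 17)` can be checked against `negacyclic_mul_schoolbook([1, 2, 3, 4], [5, 6, 7, 8], 17)`; the two should agree whenever psi satisfies the stated assumption.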
Guillaume Dabosville, Houssem Maghrebi, A. Lhuillery, J. Bringer, Thanh-Ha Le
{"title":"On the Bright Side of Darkness: Side-Channel Based Authentication Protocol Against Relay Attacks","authors":"Guillaume Dabosville, Houssem Maghrebi, A. Lhuillery, J. Bringer, Thanh-Ha Le","doi":"10.1109/DSD.2019.00040","DOIUrl":"https://doi.org/10.1109/DSD.2019.00040","url":null,"abstract":"Relay attacks are nowadays well known and most designers of secure authentication protocols are aware of them. At present, the main methods to prevent these attacks are based on the so-called distance bounding technique which consists in measuring the round-trip time of the exchanged authentication messages between the prover and the verifier to estimate an upper bound on the distance between these entities. Based on this bound, the verifier checks if the prover is sufficiently close by to rule out an unauthorized entity. Recently, a new work has proposed an authentication protocol that surprisingly uses the side-channel leakage to prevent relay attacks. In this paper, we exhibit some practical and security issues of this protocol and provide a new one that fixes all of them. Then, we argue the resistance of our proposal against both side-channel and relay attacks under some realistic assumptions. Our experimental results show the efficiency of our protocol in terms of false acceptance and false rejection rates.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130834292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
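The distance bounding idea summarized in the preceding abstract can be sketched in a few lines of Python. This illustrates only the generic round-trip-time bound, not the side-channel based protocol the paper proposes; the function names, the optional `processing_s` term (the prover's known processing delay) and the assumption that signals travel at the speed of light are hypothetical choices made for the illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_upper_bound(round_trip_s, processing_s=0.0):
    """Upper bound (in metres) on the prover-verifier distance.

    The signal covers the distance twice, so the bound is c * t / 2, where t
    is the measured round-trip time minus the assumed processing delay.
    """
    time_of_flight = round_trip_s - processing_s
    if time_of_flight < 0:
        raise ValueError("processing delay exceeds measured round-trip time")
    return SPEED_OF_LIGHT_M_PER_S * time_of_flight / 2.0


def prover_is_close(round_trip_s, max_distance_m, processing_s=0.0):
    """Accept the prover only if the distance bound is within the limit."""
    return distance_upper_bound(round_trip_s, processing_s) <= max_distance_m
```

With a 20 ns round trip and no processing delay the bound is roughly 3 m, so `prover_is_close(20e-9, 5.0)` accepts, whereas a relayed exchange that stretches the round trip to 100 ns (about 15 m) would be rejected by the same 5 m limit.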
Eirini Liotou, D. Tsolkas, S. Tennina, L. Pomante, Giorgos Kalpaktsoglou, N. Passas
{"title":"The CASPER Project Approach Towards User-Centric Mobile Networks","authors":"Eirini Liotou, D. Tsolkas, S. Tennina, L. Pomante, Giorgos Kalpaktsoglou, N. Passas","doi":"10.1109/DSD.2019.00065","DOIUrl":"https://doi.org/10.1109/DSD.2019.00065","url":null,"abstract":"This paper presents an overview of the project CASPER, a Marie Curie Research and Innovation Staff Exchange (RISE) project running from 2016 until 2020. The main objective of CASPER is to combine academic and industrial forces towards leveraging the expected benefits of Quality of Experience (QoE) exploitation in future digital networks. In particular, CASPER exploits the most recent advances in communication networks, such as the Software Defined Networking (SDN) and the Network Functions Virtualisation (NFV) in order to design and implement a middleware architecture for QoE-driven service provisioning. CASPER, therefore, addresses the challenges that mobile network operators and service providers face in the current fully digitalized era of communications, concerning the need to provide customer-oriented, reliable, efficient, flexible and dynamic service management.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134222414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Caio Hoffman, C. Gebotys, Diego F. Aranha, M. Côrtes, G. Araújo
{"title":"Circumventing Uniqueness of XOR Arbiter PUFs","authors":"Caio Hoffman, C. Gebotys, Diego F. Aranha, M. Côrtes, G. Araújo","doi":"10.1109/DSD.2019.00041","DOIUrl":"https://doi.org/10.1109/DSD.2019.00041","url":null,"abstract":"A fundamental property of Physical Unclonable Functions (PUFs) is uniqueness, which results from the intrinsic characteristics of each PUF instance. However, PUF architectures employ elements whose physical characteristics and behavior may be very similar among different instances, thus leaking unwanted information. We explore the consequences of this effect by mounting Template Attacks over XOR Arbiter PUFs. In the attack, Challenge-Response Pairs (CRPs) are profiled in one FPGA instance of the PUF to predict responses of a different FPGA instance, obtaining up to 80% accuracy. We show that replicating the same attack strategy with a well-known Machine Learning (ML) algorithm would not be as effective, since different PUF instances will not share similar CRP sets. Our template attack needs only a few CRPs for profiling (at most 170), but it can be applied to different instances without additional training, which Machine Learning cannot do with unbiased PUF instances.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134449025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of Novel CMOS Based Inexact Subtractors and Dividers for Approximate Computing: An In-Depth Comparison with PTL Based Designs","authors":"C. Jha, Joycee Mekie","doi":"10.1109/DSD.2019.00034","DOIUrl":"https://doi.org/10.1109/DSD.2019.00034","url":null,"abstract":"Multimedia applications consume an immense amount of energy. These applications have division as one of the fundamental operations. Division is also one of the costliest operations in terms of energy consumption. Thus, several works have addressed the issue of energy consumption in multimedia applications by using approximate dividers based on pass transistor logic (PTL). Since these applications are resilient to erroneous computations, huge energy benefits are obtained from approximate computations with similar output quality. In this paper, we show that PTL based designs are not suitable for lower technology nodes. We performed an in-depth analysis using UMC 65nm and UMC 28nm to highlight the adverse effects of technology scaling on energy consumption and delay in PTL based designs as compared to CMOS based designs. We also propose four different inexact CMOS subtractor (ICS) designs, as they are the basic repeated module in inexact restoring array dividers (IRADs). Our proposed ICS designs consume ~2× less dynamic energy and ~3× less static power, and have ~2.5× less delay, as compared to the existing PTL based designs in UMC 65nm. These benefits increase for UMC 28nm, which shows that PTL based designs worsen further at lower technology nodes. 
IRADs also give about 50% reduction in energy consumption with only 3% degradation in Structural Similarity (SSIM) Index, an image quality metric in multimedia applications like change detection, background removal, and JPEG compression, as compared to exact restoring array divider (ERAD).","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"319 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133835849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study of Performance and Power Consumption Differences Among Different ISAs","authors":"Ayaz Akram, L. Sawalha","doi":"10.1109/DSD.2019.00098","DOIUrl":"https://doi.org/10.1109/DSD.2019.00098","url":null,"abstract":"Recent advances in different instruction set architectures (ISAs) and their implementations have revived the argument on the role of ISAs in the overall performance and energy efficiency of a processor. Many computer architects believe that with current compiler and microarchitecture developments, the choice of ISA is not a decisive matter anymore. On the other hand, some believe that ISAs can still play an important role in the overall performance and energy efficiency of a computer system. Our objective is to compare and contrast ISAs by finding the differences in performance and energy consumption across ISAs and the reasons behind those differences. Our work shows that ISAs affect the performance and energy efficiency of applications differently based on their inherent characteristics.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"5 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115733725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander Lamprecht, Ananth Garikapati, Swaminathan Narayanaswamy, S. Steinhorst
{"title":"Enhancing Battery Pack Capacity Utilization in Electric Vehicle Fleets via SoC-Preconditioning","authors":"Alexander Lamprecht, Ananth Garikapati, Swaminathan Narayanaswamy, S. Steinhorst","doi":"10.1109/DSD.2019.00059","DOIUrl":"https://doi.org/10.1109/DSD.2019.00059","url":null,"abstract":"Modern public transport solutions based on autonomous electric vehicles are on the rise. Public transportation as a service on demand is becoming a reality. Therefore, vehicles suitable for these kinds of applications need to be developed. One critical factor for such vehicles is a short turnaround time at the charging spot. Maximizing the utilization of a given battery pack capacity and minimizing the time spent charging are therefore of central importance. In this paper, we propose a novel preconditioning algorithm to minimize the time an EV is connected to the charging station. Our proposed approach uses existing Active Cell Balancing (ACB) hardware of the battery pack to precondition the State of Charge (SoC) of cells such that all cells reach the top SoC threshold at the same time without requiring an additional balancing phase during charging. This is done by considering the individual cells' charging rate to precondition them for achieving an equal time to full charge. Applying the same approach for discharging, we also extend the driving range of an EV, which otherwise is limited by the cell with the lowest SoC in the pack. 
Case studies show that our proposed preconditioning algorithm increases the usable energy of the battery pack by up to 3% compared to conventional balancing algorithms while effectively halving the time connected to a charging station, all without requiring any additional hardware components.","PeriodicalId":217233,"journal":{"name":"2019 22nd Euromicro Conference on Digital System Design (DSD)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120948890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}