J. Winderickx, An Braeken, Dave Singelée, R. Peeters, T. Vandenryt, R. Thoelen, N. Mentens
{"title":"Digital signatures and signcryption schemes on embedded devices: a trade-off between computation and storage","authors":"J. Winderickx, An Braeken, Dave Singelée, R. Peeters, T. Vandenryt, R. Thoelen, N. Mentens","doi":"10.1145/3203217.3206426","DOIUrl":"https://doi.org/10.1145/3203217.3206426","url":null,"abstract":"This paper targets the efficient implementation of digital signatures and signcryption schemes on typical internet-of-things (IoT) devices, i.e. embedded processors with constrained computation power and storage. Both signcryption schemes (providing digital signatures and encryption simultaneously) and digital signatures rely on computation-intensive public-key cryptography. When the number of signatures or encrypted messages the device needs to generate after deployment is limited, a trade-off can be made between performing the entire computation on the embedded device or moving part of the computation to a precomputation phase. The latter results in the storage of the precomputed values in the memory of the processor. We examine this trade-off on a health sensor platform and we additionally apply storage encryption, resulting in five implementation variants of the considered schemes.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"164 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121064672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the theory of speculative checkpointing: time and energy considerations","authors":"Omer Subasi, S. Krishnamoorthy","doi":"10.1145/3203217.3203232","DOIUrl":"https://doi.org/10.1145/3203217.3203232","url":null,"abstract":"Collective checkpoint/rollback is the most popular approach for dealing with fail-stop errors on high-performance computing platforms. Prior work has focused on choosing checkpoint intervals that minimize the total cost of checkpoint/rollback. This work introduces the notion of speculative checkpointing, where we probabilistically skip some checkpoints. The careful selection of checkpoints either to be taken or skipped has the potential to reduce the total checkpoint/rollback overhead. We mathematically formulate the overall checkpoint/rollback cost in the presence of speculation. We consider the choice of speculation as a fixed probability or a probability distribution. We formulate two criteria to be minimized: total execution time and approximate total energy. We derive the criteria for beneficial speculative checkpointing for exponential and arbitrary failure distributions. Furthermore, we analyze the joint optimization of energy and time to express the trade-offs mathematically. We validate the formulations and evaluate various scenarios using discrete-event simulation. Experimental evaluation validates the models and demonstrates that employing speculation and choosing to speculate by sampling a distribution derived from the failure distribution achieves the best performance.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"253 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122535667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Reghenzani, S. Formentin, G. Massari, W. Fornaciari
{"title":"A constrained extremum-seeking control for CPU thermal management","authors":"F. Reghenzani, S. Formentin, G. Massari, W. Fornaciari","doi":"10.1145/3203217.3204464","DOIUrl":"https://doi.org/10.1145/3203217.3204464","url":null,"abstract":"The increasing complexity of computing architectures is pushing for novel Dynamic Thermal Management (DTM) techniques. Accordingly, more accurate power and thermal models are required. In this work, we propose a thermal controller based on a constrained extremum-seeking algorithm, enabling resource allocation optimization under specific thermal constraints. This approach comes with many advantages. First, the controller does not require any model of the system, dropping the need for a complex and potentially imprecise estimation phase. Second, it allows the control of derived measurements. We show how this may positively impact CPU reliability.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130320752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Madroñal, Antoine Morvan, R. Lazcano, R. Salvador, K. Desnos, E. J. Martínez, C. Sanz
{"title":"Automatic instrumentation of dataflow applications using PAPI","authors":"D. Madroñal, Antoine Morvan, R. Lazcano, R. Salvador, K. Desnos, E. J. Martínez, C. Sanz","doi":"10.1145/3203217.3209886","DOIUrl":"https://doi.org/10.1145/3203217.3209886","url":null,"abstract":"The widening of the complexity-productivity gap witnessed in recent years is becoming unaffordable from the application development point of view. New design methods try to automate most designers' tasks in order to bridge this gap. In addition, new Models of Computation (MoC), such as dataflow-based ones, ease the expression of parallelism within applications and lead to higher productivity. Rapid prototyping design tools offer fast estimations of the soundness of design choices. A key step when prototyping an application is to have representative performance indicators to estimate the validity of the design choices. Such indicators can be obtained using hardware information through the Performance API (PAPI). In this work, PAPI and a dataflow MoC are integrated within a Y-chart design flow. The implementation takes the form of a dedicated automatic code generation scheme within the Preesm tool. Preliminary results show that, depending on the complexity of the application, the computation time overhead due to monitoring varies from being almost negligible to more than 50%. Also, on top of offering accurate hardware performance indicators, the extracted values can be combined to estimate power or energy consumption.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134086375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Similarity based classification of ADHD using singular value decomposition","authors":"Taban Eslami, F. Saeed","doi":"10.1145/3203217.3203239","DOIUrl":"https://doi.org/10.1145/3203217.3203239","url":null,"abstract":"Attention deficit hyperactivity disorder (ADHD) is one of the most common brain disorders among children. This disorder is considered a major threat to public health and causes attention, focus and organizing difficulties for children and even adults. Since the cause of ADHD is not yet known, data mining algorithms are being used to help discover patterns which discriminate healthy from ADHD subjects. Numerous efforts are underway with the goal of developing classification tools for ADHD diagnosis based on functional and structural magnetic resonance imaging data of the brain. In this paper, we used Eros, a technique for computing the similarity between two multivariate time series, along with a k-Nearest-Neighbor classifier to classify healthy vs ADHD children. We designed a model selection scheme called J-Eros which is able to pick the optimum value of k for k-Nearest-Neighbor from the training data. We applied this technique to the public data provided by the ADHD-200 Consortium competition, and our results show that J-Eros is capable of discriminating healthy from ADHD children, outperforming the best results reported by the ADHD-200 competition by about 20 percent for two datasets. The implemented code is available under the GPL license on the GitHub portal of our lab at https://github.com/pcdslab/J-Eros.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114983545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data cube and cloud resources as platform for seamless geospatial computation","authors":"G. Pagani, L. Trani","doi":"10.1145/3203217.3205861","DOIUrl":"https://doi.org/10.1145/3203217.3205861","url":null,"abstract":"The data cube paradigm has been successfully applied in several domains of the geosciences. In this work we present the results of an analysis conducted in the context of the EU project EUDAT2020. We combined the data cube concept with cloud resources in a specific use case: the computation of the sky view factor of The Netherlands. We report our experience moving from a \"traditional\" - file-based - approach to a data cube approach to manage and analyse large geospatial datasets. We provide an empirical analysis of the results from a user perspective, discuss possible applications and recommendations.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126166773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng Qian, Libo Huang, Qi Yu, Zhiying Wang, B. Childers
{"title":"CMH: compression management for improving capacity in the hybrid memory cube","authors":"Cheng Qian, Libo Huang, Qi Yu, Zhiying Wang, B. Childers","doi":"10.1145/3203217.3203235","DOIUrl":"https://doi.org/10.1145/3203217.3203235","url":null,"abstract":"The Hybrid Memory Cube (HMC) is a novel 3D memory architecture that efficiently improves bandwidth and saves energy. However, due to limitations in scalability and power density of a DRAM bit cell, the physical data capacity of an individual HMC is relatively modest and unlikely to grow significantly, which is likely to be a challenge in adopting the HMC for big data in high-performance computing. In this paper, we propose a new strategy to increase the effective data capacity of the HMC, called Compression Management for HMC (CMH). CMH is incorporated in the logic layer of the HMC. By selectively compressing data during transmission and storing the selectively compressed data in the 3D memory stack, CMH increases data capacity while also improving effective bandwidth. For several memory-intensive benchmarks, our results show that CMH reduces pressure on memory capacity by 64.4%, and improves bandwidth by 42.4%. Similarly good results are observed for multi-programmed workloads, reducing capacity pressure by 66.2% and improving bandwidth by 47.8%. Although compression has latency overhead, by introducing a small cache in the HMC logic layer to store metadata for compression, CMH mitigates any increase in transaction latency. The overhead in instructions per cycle is a minimal 1.2% and 1.5%, respectively, for single-core and multi-core workloads. The IPC is stable and is not harmed by the inclusion of compression.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128609811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-on-chip evaluation for a novel neural architecture","authors":"Markos Kynigos, J. Navaridas, L. Plana, S. Furber","doi":"10.1145/3203217.3203268","DOIUrl":"https://doi.org/10.1145/3203217.3203268","url":null,"abstract":"This paper provides a performance evaluation and trade-off analysis of a novel chip architecture for neuromorphic computing, especially focused on the memory subsystems and the Network-On-Chip (NoC). More precisely, we study the performance-related effect of the number of memory modules, as well as that of allowing direct core-to-core communication. Our simulation-based experimental work yields many interesting results on the above aspects and allows us to confirm that congestion at the NoC level is unlikely to degrade performance.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114108004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GraphBLAS: handling performance concerns in large graph analytics","authors":"Manoj Kumar, J. Moreira, P. Pattnaik","doi":"10.1145/3203217.3205342","DOIUrl":"https://doi.org/10.1145/3203217.3205342","url":null,"abstract":"Emerging applications in health-care, social media analytics, cyber-security, homeland security, and marketing require large graph analytics. Attaining good performance on these applications on modern day hardware is challenging because of the complex pipelines and deep memory hierarchy of these machines. In this paper, we review the linear algebra formulation of graph-analytics and show that it effectively handles the separation of performance concerns, best handled by system developers, from application logic concerns. The linear algebra formulation leverages the community experience in optimizing both hardware and software for applications that have a substantial linear algebra component. We review the GraphBLAS API, a compact C API for linear algebra formulation of graph algorithms. The core semiring operations are described first, followed by the rest of the API. We then illustrate how commonly used graph algorithms are implemented using the main GraphBLAS API calls. Executing these algorithms on a highly optimized linear algebra run-time validates that the time spent in execution of the algorithm is indeed almost entirely in the library, thus delegating the performance concerns solely to the library developer. Furthermore, the linear algebra formulation consistently outperforms the textbook version of these algorithms by a factor of two to five. Vector and matrix multiplications consume the majority of the computational time, particularly as problem size increases, putting them in the cross hairs for performance optimization.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133739406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Mehrabian, Shuai Sun, Vikram K. Narayana, Jeff Anderson, Jiaxin Peng, V. Sorger, T. El-Ghazawi
{"title":"D3NoC: a dynamic data-driven hybrid photonic plasmonic NoC","authors":"A. Mehrabian, Shuai Sun, Vikram K. Narayana, Jeff Anderson, Jiaxin Peng, V. Sorger, T. El-Ghazawi","doi":"10.1145/3203217.3203272","DOIUrl":"https://doi.org/10.1145/3203217.3203272","url":null,"abstract":"It was previously shown that Hybrid Photonic Plasmonic Interconnect (HyPPI) is an efficient candidate for augmenting electronic networks-on-chip (NoCs). Here we introduce a reconfigurable Hybrid Photonic Plasmonic NoC termed D3NOC, which intelligently augments electrical meshes with a hybrid photon-plasmon interconnect express bus. The intelligence uses the Dynamic Data Driven Application System (DDDAS) paradigm, where computations and measurements form a dynamic closed feedback loop. Our results show up to 67% latency improvements and 69% dynamic power net improvements beyond overhead-corrected performance compared to a 16 × 16 base electrical mesh.","PeriodicalId":127096,"journal":{"name":"Proceedings of the 15th ACM International Conference on Computing Frontiers","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130344273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}