2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD): Latest Publications

A Machine Learning Approach for Parameter Screening in Earthquake Simulation
Marisol Monterrubio Velasco, J. C. Carrasco-Jiménez, Octavio Castillo Reyes, F. Cucchietti, J. Puente
Abstract: Earthquakes are the result of rupture in the Earth's crust. The rupture process is difficult to model deterministically due to the number of unmeasurable parameters involved and poorly constrained physical conditions, as well as the very diverse scales involved in their nucleation (meters) and complete failure (up to hundreds of kilometers). In this research work we focus on synthetic seismic catalogs generated with a stochastic modeling technique called the Fiber Bundle Model (FBM). These catalogs can be readily compared with statistical measures computed from real earthquake series, but the link between the FBM parameters and the characteristics of the obtained earthquake series is difficult to assess. Furthermore, the stochastic nature of the model requires a large number of realizations in order to attain statistical robustness. The aim of this work is to estimate the FBM parameters that generate aftershock sequences that are similar to those generated by real seismic events. In order to estimate the optimal combination of parameters that generates such sequences, we executed a large number of simulations with different combinations of parameters using High-Performance Computing (HPC) resources to reduce compute time. Lastly, the synthetic datasets were used to train a supervised Machine Learning (ML) model that analyzes and extracts statistical patterns that reproduce the observations regarding aftershock occurrence and its spatio-temporal distribution in real seismic events.
DOI: 10.1109/CAHPC.2018.8645865 (https://doi.org/10.1109/CAHPC.2018.8645865)
Published: 2018-09-01
Citations: 8
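For readers unfamiliar with fiber bundle models, the sketch below illustrates the general idea using the textbook equal-load-sharing variant. This is an assumption made for illustration: the abstract does not say which FBM variant or parameters the authors use, and the function name `els_avalanches`, the uniform threshold distribution, and the bundle size are all hypothetical choices. Each avalanche size plays the role of one synthetic event in a catalog.

```python
import random
from typing import List, Sequence


def els_avalanches(thresholds: Sequence[float]) -> List[int]:
    """Avalanche sizes of an equal-load-sharing fiber bundle (textbook variant).

    The external force is raised just enough to break the weakest intact
    fiber; the force is then shared equally by the survivors, which may
    trigger further failures at constant force (an avalanche).
    """
    xs = sorted(thresholds)
    n = len(xs)
    sizes = []
    i = 0
    while i < n:
        # Total force at which fiber i breaks while (n - i) fibers are intact.
        force = xs[i] * (n - i)
        size = 1
        i += 1
        # Survivors each carry force / (n - i); keep breaking fibers whose
        # threshold is at or below that shared load.
        while i < n and xs[i] * (n - i) <= force:
            size += 1
            i += 1
        sizes.append(size)
    return sizes


# Hypothetical realization: 1000 fibers with uniform random thresholds.
rng = random.Random(0)
catalog = els_avalanches([rng.random() for _ in range(1000)])
print(len(catalog), "avalanches; largest size:", max(catalog))
```

Because the model is stochastic, statistics of such catalogs only stabilize over many realizations, which is the motivation the abstract gives for running the parameter sweep on HPC resources.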
Online Detection of Spectre Attacks Using Microarchitectural Traces from Performance Counters
Congmiao Li, J. Gaudiot
Abstract: To improve processor performance, computer architects have adopted such acceleration techniques as speculative execution and caching. However, researchers have recently discovered that this approach implies inherent security flaws, as exploited by Meltdown and Spectre. Attacks targeting these vulnerabilities can leak protected data through side channels such as data cache timing by exploiting mis-speculated executions. The flaws can be catastrophic because they are fundamental and widespread and they affect many modern processors. Mitigating the effect of Meltdown is relatively straightforward in that it entails a software-based fix which has already been deployed by major OS vendors. However, to this day, there is no effective mitigation for Spectre. Fixing the problem may require a redesign of the architecture for conditional execution in future processors. In addition, a Spectre attack is hard to detect using traditional software-based antivirus techniques because it does not leave traces in traditional log files. In this paper, we propose monitoring microarchitectural events, such as cache misses and branch mispredictions, from existing CPU performance counters to detect Spectre during attack runtime. Our detector was able to achieve 0% false negatives with less than 1% false positives using various machine learning classifiers with a reasonable performance overhead.
DOI: 10.1109/CAHPC.2018.8645918 (https://doi.org/10.1109/CAHPC.2018.8645918)
Published: 2018-09-01
Citations: 27
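The two figures quoted in the abstract above are the standard false-negative and false-positive rates of a binary detector. As a reminder of how they are defined, here is a minimal sketch; the function name `error_rates` and the sample labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    Labels: 1 = attack sample, 0 = benign sample.
    """
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


# Made-up labels for illustration only: two attacks, four benign samples.
truth = [1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
print(error_rates(truth, predicted))  # (0.25, 0.0)
```

A 0% false-negative rate means every attack sample was flagged; the false-positive rate is the fraction of benign samples flagged by mistake.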
A Fault-Tolerant Agent-Based Architecture for Transient Servers in Fog Computing
J. P. A. Neto, D. Pianto, C. Ralha
Abstract: Cloud datacenters are exploring their idle resources and offering virtual machines as transient servers without availability guarantees. Spot instances are transient servers offered by Amazon AWS, with rules that define prices according to supply and demand. These instances will run for as long as the current price is lower than the maximum bid price given by users. Spot instances have been increasingly used for executing computation- and memory-intensive applications. By using dynamic fault-tolerant mechanisms and appropriate strategies, users can effectively use spot instances to run applications at a cheaper price. This paper presents a resilient multi-strategy agent-based cloud computing architecture. The architecture combines machine learning and a statistical model to predict instance survival times, refine fault tolerance parameters and reduce total execution time. We evaluate our strategies and the experiments demonstrate high levels of accuracy, reaching a 94% survival prediction success rate, which indicates that the model can be effectively used to define execution strategies to prevent failures at revocation events under realistic working conditions.
DOI: 10.1109/CAHPC.2018.8645859 (https://doi.org/10.1109/CAHPC.2018.8645859)
Published: 2018-09-01
Citations: 2
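The revocation rule stated in the abstract above (an instance runs for as long as the current price is lower than the user's maximum bid) can be sketched as a scan over a price trace. The function name `revocation_index` and the price values are hypothetical; the paper's survival predictor is a learned model and is not reproduced here.

```python
from typing import Optional, Sequence


def revocation_index(prices: Sequence[float], max_bid: float) -> Optional[int]:
    """Index of the first price sample at which the instance is revoked.

    The instance keeps running while the current price is lower than
    max_bid. Returns None if it survives the whole trace.
    """
    for i, price in enumerate(prices):
        if price >= max_bid:  # price is no longer lower than the bid
            return i
    return None


# Made-up hourly price trace, for illustration only.
trace = [0.10, 0.12, 0.11, 0.15, 0.13]
print(revocation_index(trace, max_bid=0.14))  # 3
print(revocation_index(trace, max_bid=0.20))  # None
```

A survival-time predictor of the kind the abstract describes would estimate this index before the trace is known, so that checkpoints can be scheduled ahead of the revocation.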
Variable-Size Batched Condition Number Calculation on GPUs
H. Anzt, J. Dongarra, Goran Flegar, Thomas Grützmacher
Abstract: We present a kernel that is designed to quickly compute the condition number of a large collection of tiny matrices on a graphics processing unit (GPU). The matrices can differ in size and the process integrates the use of pivoting to ensure a numerically stable matrix inversion. The performance assessment reveals that, in double precision arithmetic, the new GPU kernel achieves up to 550 GFLOPs (billions of floating-point operations per second) and 800 GFLOPs on NVIDIA's P100 and V100 GPUs, respectively. The results also demonstrate a considerable speed-up with respect to a workflow that computes the condition number via launching a set of four batched kernels. In addition, we present a variable-size batched kernel for the computation of the matrix infinity norm. We show that this memory-bound kernel achieves up to 90% of the sustainable peak bandwidth.
DOI: 10.1109/CAHPC.2018.8645907 (https://doi.org/10.1109/CAHPC.2018.8645907)
Published: 2018-09-01
Citations: 1
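As a CPU-side reference for the quantity described in the abstract above, the sketch below computes an infinity-norm condition number, cond(A) = ||A|| * ||inverse(A)||, for a batch of small matrices of different sizes, using Gauss-Jordan inversion with partial pivoting. It is a plain sequential illustration under assumptions: the abstract does not give the authors' GPU kernel design or confirm which norm the condition number uses, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def inf_norm(a: Matrix) -> float:
    """Matrix infinity norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def invert(a: Matrix) -> Matrix:
    """Invert a small square matrix by Gauss-Jordan with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [A | I].
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def condition_numbers(batch: List[Matrix]) -> List[float]:
    """Infinity-norm condition number of each matrix in a variable-size batch."""
    return [inf_norm(a) * inf_norm(invert(a)) for a in batch]


# Matrices of sizes 1, 2 and 3 in the same batch.
batch = [
    [[2.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
]
print(condition_numbers(batch))
```

Each matrix is independent of the others, which is what makes the problem suitable for a batched GPU kernel.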
Copyright
DOI: 10.1109/cahpc.2018.8645922 (https://doi.org/10.1109/cahpc.2018.8645922)
Published: 2018-09-01
Citations: 0
Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model
M. Wittmann, G. Hager, R. Janalík, M. Lanser, A. Klawonn, O. Rheinbach, O. Schenk, G. Wellein
Abstract: The Roofline model is widely used to visualize the performance of executed code together with the upper performance bounds given by the memory bandwidth and the processor peak performance. The model can thus provide an insightful visualization of bottlenecks. In this paper, we try to establish realistic bandwidth ceilings for the sparse triangular solve step of PARDISO, a leading sparse direct solver package, which is also part of the Intel MKL library. The performance of the forward and backward substitution process is analyzed and benchmarked for a representative set of sparse matrices on seven modern x86-type multicore architectures and the Knights Landing manycore architecture. It is shown how to accurately measure the necessary quantities for threaded code as well, and the measurement approach, its validation, and its limitations are discussed. Our modeling approach covers the serial and parallel execution phases, allowing for in-socket performance predictions.
DOI: 10.1109/CAHPC.2018.8645938 (https://doi.org/10.1109/CAHPC.2018.8645938)
Published: 2018-09-01
Citations: 9
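The upper bound that the abstract above refers to is, in the classic Roofline model, the minimum of the processor peak performance and the product of memory bandwidth and arithmetic intensity. The sketch below shows only that classic form; the paper's modified model is not described in the abstract and is not reproduced here. The function name and the machine numbers are hypothetical.

```python
def roofline_bound(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Classic Roofline performance bound in GFLOP/s.

    intensity is the arithmetic intensity in FLOP per byte of memory traffic.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


# Made-up machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
print(roofline_bound(1000.0, 100.0, 0.25))  # 25.0   (memory bound)
print(roofline_bound(1000.0, 100.0, 20.0))  # 1000.0 (compute bound)
```

Low-intensity kernels such as sparse triangular solves sit on the bandwidth-limited side of this bound, which is why realistic bandwidth ceilings matter for them.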
Energy Efficient Parallel K-Means Clustering for an Intel® Hybrid Multi-Chip Package
M. Souza, L. Maciel, Pedro Henrique Penna, H. Freitas
Abstract: FPGA devices have been proving to be good candidates to accelerate applications from different research topics. For instance, machine learning applications such as K-Means clustering usually rely on large amounts of data to be processed, and, despite the performance offered by other architectures, FPGAs can offer better energy efficiency. With that in mind, Intel has launched a platform that integrates a multicore and an FPGA in the same package, enabling low latency and coherent fine-grained data offload. In this paper, we present a parallel implementation of the K-Means clustering algorithm for this novel platform, using the OpenCL language, and compare it against other platforms. We found that the CPU+FPGA platform was 70.71% to 85.92% more energy efficient than the CPU-only approach, with Standard and Tiny input sizes respectively, and a performance improvement of up to 68.21% was obtained with the Tiny input size. Furthermore, it was up to 7.2× more energy efficient than an Intel® Xeon Phi™, 21.5× more than a cluster of Raspberry Pi boards, and 3.8× more than the low-power MPPA-256 architecture, when the Standard input size was used.
DOI: 10.1109/CAHPC.2018.8645850 (https://doi.org/10.1109/CAHPC.2018.8645850)
Published: 2018-09-01
Citations: 3
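For reference, the algorithm that the abstract above accelerates is standard K-Means (Lloyd's iteration): assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below is a sequential Python illustration only. It is not the authors' OpenCL implementation, and the function names, fixed iteration count, and sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           iterations: int = 10) -> Tuple[List[Point], List[int]]:
    """Run a fixed number of Lloyd iterations from the given initial centroids."""
    cents = list(centroids)
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point. The points are
        # independent here, which is the part a parallel version distributes.
        labels = [min(range(len(cents)), key=lambda k: sq_dist(p, cents[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(cents)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                cents[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return cents, labels


# Made-up 2D points forming two obvious groups.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(kmeans(pts, [(0.0, 0.0), (10.0, 10.0)]))
```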
Serendipity: How Supercomputing Technology is Enabling a Revolution in Artificial Intelligence
José Moreira
DOI: 10.1109/cahpc.2018.8645849 (https://doi.org/10.1109/cahpc.2018.8645849)
Published: 2018-09-01
Citations: 0
Program Committees
DOI: 10.1109/cahpc.2018.8645915 (https://doi.org/10.1109/cahpc.2018.8645915)
Published: 2018-09-01
Citations: 0
Automatic Ray-Tracer Cloud Offloading in OPENMP
M. Mortatti, H. Yviquel, G. Araújo
Abstract: Rendering an image from a 3D scene requires a large amount of computation which grows exponentially with the complexity of the scene (e.g. number of objects and light sources). With the increasing demand for high definition content, 3D designers need to use high-performance computer systems to keep the rendering time acceptable. Since owning computer clusters is expensive, designers usually rent computing power directly from cloud service providers (e.g., AWS and Azure). However, even though many cloud providers already offer dedicated rendering services, integrating them within the standard workflow of modeling software can become a complex and cumbersome task. It typically requires exporting the project from the design software, dealing with various access control mechanisms from different clouds to upload the project, and executing the rendering remotely through the command line. Offloading computation to the cloud is a technique which can considerably simplify such tasks. To achieve that, this paper uses an extension of OpenMP 4.X to eliminate any major interactions with the end-user, while minimizing the complexity of cloud integration and optimizing the design workflow. It applies this approach to a ray-tracing application, a simplified version of the engines used by professional 3D modeling software (e.g. Blender). It automatically offloads the rendering process from the user's computer to a computer cluster within the Microsoft Azure cloud, brings the resulting images back after the computation ends and displays them directly on the screen of the user's computer, thus providing a transparent programming model and good speed-ups over local execution.
DOI: 10.1109/CAHPC.2018.8645871 (https://doi.org/10.1109/CAHPC.2018.8645871)
Published: 2018-09-01
Citations: 1