2018 IEEE 25th International Conference on High Performance Computing (HiPC): Latest Papers, Page 3

Dynamic Count-Min Sketch for Analytical Queries Over Continuous Data Streams

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00033

Xiaobo Zhu, Guangjun Wu, Hong Zhang, Shupeng Wang, Bingnan Ma

Citations: 1

Making Strassen Matrix Multiplication Safe

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00028

Himeshi De Silva, J. Gustafson, W. Wong

Abstract: Strassen's recursive algorithm for matrix-matrix multiplication has seen slow adoption in practical applications despite being asymptotically faster than the traditional algorithm. A primary cause for this is the comparatively weaker numerical stability of its results. Techniques that aim to improve the errors of Strassen stand the risk of losing any potential performance gain. Moreover, current methods of evaluating such techniques for safety are overly pessimistic or error prone and generally do not allow for quick and accurate comparisons. In this paper we present an efficient technique to obtain rigorous error bounds for floating point computations based on an implementation of unum arithmetic. Using it, we evaluate three techniques - exact dot product, fused multiply-add, and matrix quadrant rotation - that can potentially improve the numerical stability of Strassen's algorithm for practical use. We also propose a novel error-based heuristic rotation scheme for matrix quadrant rotation. Finally, we apply techniques that improve numerical safety with low overhead to a LINPACK linear solver to demonstrate the usefulness of the Strassen algorithm in practice.

Citations: 3
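For readers unfamiliar with the algorithm named in "Making Strassen Matrix Multiplication Safe", the following is a minimal sketch of the textbook Strassen recursion, which uses seven recursive block products instead of eight. It is not the authors' implementation: the function names, the `cutoff` parameter, and the restriction to square matrices whose size is a power of two are assumptions made for illustration, and none of the paper's stability techniques (exact dot product, fused multiply-add, quadrant rotation) are included.

```python
# Illustrative sketch only: textbook Strassen recursion in pure Python.
# Assumes square matrices (lists of lists) whose size is a power of two.
# All names here are hypothetical, not taken from the paper.

def naive_matmul(a, b):
    """Classical O(n^3) product, used as the recursion base case."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def _add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def _sub(x, y):
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def _split(m):
    """Return the four quadrants (top-left, top-right, bottom-left, bottom-right)."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])

def strassen(a, b, cutoff=1):
    """Multiply a and b with Strassen's seven-product recursion."""
    n = len(a)
    if n <= cutoff:
        return naive_matmul(a, b)
    a11, a12, a21, a22 = _split(a)
    b11, b12, b21, b22 = _split(b)
    # The seven block products.
    m1 = strassen(_add(a11, a22), _add(b11, b22), cutoff)
    m2 = strassen(_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, _sub(b12, b22), cutoff)
    m4 = strassen(a22, _sub(b21, b11), cutoff)
    m5 = strassen(_add(a11, a12), b22, cutoff)
    m6 = strassen(_sub(a21, a11), _add(b11, b12), cutoff)
    m7 = strassen(_sub(a12, a22), _add(b21, b22), cutoff)
    # Recombine into the quadrants of the result.
    c11 = _add(_sub(_add(m1, m4), m5), m7)
    c12 = _add(m3, m5)
    c21 = _add(m2, m4)
    c22 = _add(_sub(_add(m1, m3), m2), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

With integer inputs the result matches the classical product exactly. With floating-point inputs the extra additions and subtractions are where the two can diverge, which is the stability concern the abstract describes.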

Achieving Performance and Programmability for MapReduce(-Like) Frameworks

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00043

Jiayang Guo, G. Agrawal

Citations: 4

Acceleration of an Adaptive Cartesian Mesh CFD Solver in the Current Generation Processor Architectures

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00025

V. HarichandM., Bharatkumar Sharma, G. Sudhakaran, V. Ashok

Abstract: In this paper, the challenges involved in accelerating the adaptive Cartesian mesh CFD solver PARAS-3D on current-generation processors (CPUs and GPUs) are explored. CFD codes are known for their memory-bound nature, which remains a significant bottleneck in achieving higher performance. Adaptive Cartesian meshes, with their oct-tree structure, bring about more challenges in data parallelism. Moreover, Cartesian mesh solvers have higher memory bandwidth requirements due to their larger and varying stencil. The paper details how a redesign and reimplementation of a legacy Cartesian mesh CFD solver helped achieve higher performance on CPUs through improvements in algorithms and data structures. Moreover, very good scalability to thousands of cores was achieved using asynchronous communication and weighted graph partitioning. A structure-of-arrays data layout, along with GPU features such as Unified Memory and Multi-Process Service, was used in the GPU acceleration process to obtain 4.4x the performance of the CPU-only version using NVIDIA Quadro GV100 GPUs.

Citations: 0

Vidya: Performing Code-Block I/O Characterization for Data Access Optimization

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00036

H. Devarajan, Anthony Kougkas, Prajwal Challa, Xian-He Sun

Citations: 9

Title Page i

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/hipc.2018.00001

Citations: 0

Synchronization-Avoiding Graph Algorithms

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00015

J. Firoz, Marcin Zalewski, Thejaka Amila Kanewala, A. Lumsdaine

Abstract: Because they were developed for optimal sequential complexity, classical graph algorithms as found in textbooks have strictly-defined orders of operations. Enforcing a prescribed order of operations, or even an approximate order, in a distributed memory setting requires significant amounts of synchronization, which in turn can severely limit scalability. As a result, new algorithms are typically required to achieve scalable performance, even for solving well-known graph problems. Yet, even in these cases, parallel graph algorithms are written according to parallel programming models that evolved for, e.g., scientific computing, and that still have inherent, and scalability-limiting, amounts of synchronization. In this paper we present a new approach to parallel graph algorithms: synchronization-avoiding algorithms. To eliminate synchronization and its associated overhead, synchronization-avoiding algorithms perform work in an unordered and fully asynchronous fashion in such a way that the result is constantly refined toward its final state. "Wasted" work is minimized by locally prioritizing tasks using problem-dependent task utility metrics. We classify algorithms for graph applications into two broad categories: algorithms with monotonic updates (which evince global synchronization) and algorithms with non-monotonic updates (which evince vertex-centric synchronization). We apply our approach to both classes and develop novel, synchronization-avoiding algorithms for solving exemplar problems: SSSP and connected components for the former, graph coloring for the latter. We demonstrate that eliminating synchronization in conjunction with effective scheduling policies and optimizations in the runtime results in improved scalability for both classes of algorithms.

Citations: 5
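The idea of monotonic updates applied in an unordered fashion, as described for SSSP in "Synchronization-Avoiding Graph Algorithms", can be illustrated with a small sequential stand-in. The sketch below is an assumption-laden illustration, not the authors' distributed algorithm or runtime: the function name `unordered_sssp`, the adjacency-dictionary graph format, and the FIFO worklist are all hypothetical choices, and edge weights are assumed non-negative. Distances only ever decrease, so relaxations can be processed in any order and the result is still refined toward the final shortest-path distances.

```python
# Illustrative sketch only: a sequential, unordered label-correcting SSSP.
# Not the paper's distributed algorithm; all names are hypothetical.
from collections import deque

def unordered_sssp(graph, source):
    """Compute shortest-path distances from `source`.

    `graph` maps every vertex to a list of (neighbor, weight) pairs with
    non-negative weights. Work items are taken in arrival order rather than
    in strict distance order; each update is monotonic (a distance only
    ever decreases), so no global ordering is required for correctness.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0.0
    work = deque([source])
    while work:
        u = work.popleft()
        for v, w in graph[u]:
            candidate = dist[u] + w
            if candidate < dist[v]:   # monotonic update
                dist[v] = candidate
                work.append(v)        # re-examine v's outgoing edges later
    return dist

example_graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 5)],
    "c": [("b", 2)],
    "d": [],
}
distances = unordered_sssp(example_graph, "a")
```

Some relaxations in such a scheme are later superseded (here, the first tentative distance for `b`), which corresponds to the "wasted" work that the abstract says is reduced by local task prioritization.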

Characterization of the Impact of Soft Errors on Iterative Methods

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00031

B. O. Mutlu, Gokcen Kestor, J. Manzano, O. Unsal, S. Chatterjee, S. Krishnamoorthy

Abstract: Soft errors caused by transient bit flips have the potential to significantly impact an application's behavior. This has motivated the design of an array of techniques to detect, isolate, and correct soft errors using microarchitectural, architectural, compilation-based, or application-level techniques to minimize their impact on the executing application. The first step toward the design of good error detection/correction techniques involves an understanding of an application's vulnerability to soft errors. In this paper, we present the first comprehensive characterization of the impact of soft errors on the convergence characteristics of six iterative methods using application-level fault injection. In particular, we consider the use of iterative methods to incrementally solve a linear system of equations, which constitute the core kernel in many scientific applications. We analyze the impact of soft errors in terms of the type of error (single- vs. multi-bit), the distribution and location of bits affected, the data structure and statement impacted, and variation with time. In addition to understanding the vulnerability of iterative solvers to soft errors, this characterization can aid the design of fault injection campaigns that ensure systematic coverage.

Citations: 12
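To make application-level fault injection, as studied in "Characterization of the Impact of Soft Errors on Iterative Methods", concrete, here is a minimal sketch that flips one bit of a double-precision value inside a Jacobi iteration and reports how many iterations convergence takes. Everything specific here is an assumption for illustration: the listing does not say which six methods the paper studies, so Jacobi is simply a convenient stand-in, and the names `flip_bit` and `jacobi`, the `(iteration, index, bit)` fault triple, and the 2x2 test system are hypothetical.

```python
# Illustrative sketch only: single-bit fault injection into a Jacobi solver.
# Not the paper's methodology; all names and parameters are hypothetical.
import struct

def flip_bit(value, bit):
    """Return `value` with one bit of its IEEE 754 double encoding inverted.

    Bit 0 is the least significant mantissa bit; bit 63 is the sign bit.
    """
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped

def jacobi(a, b, tol=1e-12, max_iter=10000, fault=None):
    """Solve a x = b by Jacobi iteration, starting from the zero vector.

    `fault` is an optional (iteration, index, bit) triple: at that iteration
    (counted from 1) the given bit of x[index] is flipped right after the
    update. Returns (x, iterations_used).
    """
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_new = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
        if fault is not None and fault[0] == it:
            x_new[fault[1]] = flip_bit(x_new[fault[1]], fault[2])
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new, it
        x = x_new
    return x, max_iter

# A small diagonally dominant system whose exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
B = [1.0, 2.0]
x_clean, iters_clean = jacobi(A, B)
# Flip a high exponent bit of x[0] at iteration 3.
x_faulty, iters_faulty = jacobi(A, B, fault=(3, 0, 62))
```

In this particular setup the solver recovers from the injected error but needs more iterations to do so; varying the bit position, the target index, and the injection iteration is one simple way to explore the error dimensions the abstract lists.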

Secure High-Performance Computer Architectures: Challenges and Opportunities

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00038

S. Devadas

Abstract: Recent work has shown that architectural isolation can be violated through software side channel attacks that exploit microarchitectural performance optimizations such as speculation to leak secrets. While turning off microarchitectural optimizations can preclude some classes of attacks, we argue that performance and security do not have to be in conflict, provided processors are designed with security in mind. We espouse a principled approach to eliminating entire attack surfaces through microarchitectural isolation, rather than plugging attack-specific privacy leaks. We argue that minimal modifications to hardware can defend against all currently-practical side channel attacks without significant performance impact. As an application of this approach, we describe the Sanctum processor architecture that offers strong provable isolation of software modules running concurrently and sharing resources, and Sanctoom, a speculative, out-of-order variant with similar properties. These processors provide isolation even when large parts of the operating system are compromised, and their open-source implementations allow security properties to be independently verified.

Biography: Srini Devadas is the Webster Professor of EECS at MIT where he has been on the faculty since 1988. His current research interests are in computer security, computer architecture and applied cryptography. Devadas received the 2017 IEEE W. Wallace McDowell award and the 2018 IEEE Charles A. Desoer Technical Achievement award for his research in secure hardware. He is the author of “Programming for the Puzzled” (MIT Press, 2017), a book that builds a bridge between the recreational world of algorithmic puzzles and the pragmatic world of computer programming, teaching readers to program while solving puzzles. Devadas is a MacVicar Faculty Fellow, an Everett Moore Baker and a Bose award recipient, considered MIT’s highest teaching honors.

Citations: 0

Looking Under the Hood of Deep Neural Networks

2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00009

Balaraman Ravindran

Abstract: Treating deep neural networks as black boxes and using them as-is from a toolbox could potentially lead to sub-optimal performance. Increasingly, machine learning researchers have to be more aware of the computational workloads entailed by their models and how to optimize for them. In this talk, I will describe three different pieces of our recent work with deep convolutional networks and their variants in improving inference performance across a variety of tasks like object detection, identification, tracking, etc. These studies demonstrate the need for peeling back the cover and paying attention to the computation even when using standard models.

Biography: Balaraman Ravindran is a professor at the Department of Computer Science and Engineering, and the head of the Robert Bosch Centre for Data Science and AI at the Indian Institute of Technology Madras. His current research interests span the broader area of machine learning, ranging from Spatio-temporal Abstractions in Reinforcement Learning to social network analysis and Data/Text Mining. Much of the work in his group is directed toward understanding interactions and learning from them.

Citations: 0