2018 IEEE 25th International Conference on High Performance Computing (HiPC): Latest Papers

Dynamic Count-Min Sketch for Analytical Queries Over Continuous Data Streams
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00033
Xiaobo Zhu, Guangjun Wu, Hong Zhang, Shupeng Wang, Bingnan Ma
{"title":"Dynamic Count-Min Sketch for Analytical Queries Over Continuous Data Streams","authors":"Xiaobo Zhu, Guangjun Wu, Hong Zhang, Shupeng Wang, Bingnan Ma","doi":"10.1109/HiPC.2018.00033","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00033","url":null,"abstract":"Approximate query processing methods have been proposed for analytics over high-speed data streams; they compact continuous streams into a space-constrained sketch and provide reliable estimates for different queries. Count-Min (CM) is the state-of-the-art sketching structure supporting many queries with error-guaranteed estimates under limited space. However, CM requires creating a counter table beforehand according to the size of the data stream, which is usually unpredictable for dynamic data streams. In this paper, we propose an approach, called Dynamic Count-Min sketch (DCM), which is appropriate for dynamic data sets and can provide accurate estimates for point queries and self-join size queries. Our approach is composed of incremental CM sketches and allocates space in a pay-as-you-go manner. Our mathematical analysis and substantial experiments both show that our approach is appropriate for data sets with dynamic or skewed inputs and can provide error-guaranteed estimates with less space compared to CM.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115450079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
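For readers unfamiliar with the baseline that the abstract above refers to, the following is a minimal sketch of a standard, fixed-size Count-Min sketch with the two query types the abstract names (point query and self-join size). It is not the DCM method from the paper. The class name, the method names, and the use of salted BLAKE2b as the per-row hash are assumptions made for illustration only.

```python
import hashlib


class CountMinSketch:
    """Textbook fixed-size Count-Min sketch (illustrative, not the paper's DCM)."""

    def __init__(self, width, depth):
        # The counter table must be sized up front; this is the limitation
        # the abstract attributes to plain CM.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def point_query(self, item):
        # Collisions only inflate counters, so the minimum over rows is an
        # estimate that never falls below the true (non-negative) count.
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )

    def self_join_size(self):
        # Each row's sum of squared counters over-estimates the sum of squared
        # item frequencies; take the smallest row estimate.
        return min(sum(c * c for c in row) for row in self.table)


sketch = CountMinSketch(width=64, depth=4)
for token in ["a"] * 5 + ["b"] * 2:
    sketch.add(token)

print(sketch.point_query("a"))   # at least 5
print(sketch.self_join_size())   # at least 5*5 + 2*2 = 29
```

Both estimates are one-sided: hash collisions can only push them above the true value, which is why the minimum across rows is taken.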
Making Strassen Matrix Multiplication Safe
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00028
Himeshi De Silva, J. Gustafson, W. Wong
{"title":"Making Strassen Matrix Multiplication Safe","authors":"Himeshi De Silva, J. Gustafson, W. Wong","doi":"10.1109/HiPC.2018.00028","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00028","url":null,"abstract":"Strassen's recursive algorithm for matrix-matrix multiplication has seen slow adoption in practical applications despite being asymptotically faster than the traditional algorithm. A primary cause for this is the comparatively weaker numerical stability of its results. Techniques that aim to improve the errors of Strassen run the risk of losing any potential performance gain. Moreover, current methods of evaluating such techniques for safety are overly pessimistic or error-prone and generally do not allow for quick and accurate comparisons. In this paper we present an efficient technique to obtain rigorous error bounds for floating point computations based on an implementation of unum arithmetic. Using it, we evaluate three techniques - exact dot product, fused multiply-add, and matrix quadrant rotation - that can potentially improve the numerical stability of Strassen's algorithm for practical use. We also propose a novel error-based heuristic rotation scheme for matrix quadrant rotation. Finally we apply techniques that improve numerical safety with low overhead to a LINPACK linear solver to demonstrate the usefulness of the Strassen algorithm in practice.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122244576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
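As background for the abstract above, here is a minimal sketch of the textbook Strassen recursion (seven half-size products instead of eight). It assumes square matrices whose size is a power of two, stored as plain Python lists, and it does not implement unum arithmetic, exact dot products, fused multiply-add, or quadrant rotation from the paper. All function names are hypothetical.

```python
def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def naive_mul(a, b):
    # Traditional cubic-time product, used as the recursion base case.
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def split(m):
    # Return the four quadrants (top-left, top-right, bottom-left, bottom-right).
    h = len(m) // 2
    return (
        [row[:h] for row in m[:h]],
        [row[h:] for row in m[:h]],
        [row[:h] for row in m[h:]],
        [row[h:] for row in m[h:]],
    )


def strassen(a, b, cutoff=2):
    # Assumes len(a) is a power of two and cutoff >= 1.
    n = len(a)
    if n <= cutoff:
        return naive_mul(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # The seven recursive products.
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22), cutoff)
    m2 = strassen(mat_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, mat_sub(b12, b22), cutoff)
    m4 = strassen(a22, mat_sub(b21, b11), cutoff)
    m5 = strassen(mat_add(a11, a12), b22, cutoff)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12), cutoff)
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22), cutoff)
    # Recombine into the quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(mat_sub(m1, m2), m3), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


a = [[i + j for j in range(4)] for i in range(4)]
b = [[i * j + 1 for j in range(4)] for i in range(4)]
print(strassen(a, b, cutoff=1) == naive_mul(a, b))
```

With integer inputs the two products agree exactly. With floating-point inputs, the extra additions and subtractions on quadrants are where the two results can start to differ in rounding, which is the stability concern the abstract raises.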
Achieving Performance and Programmability for MapReduce(-Like) Frameworks
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00043
Jiayang Guo, G. Agrawal
{"title":"Achieving Performance and Programmability for MapReduce(-Like) Frameworks","authors":"Jiayang Guo, G. Agrawal","doi":"10.1109/HiPC.2018.00043","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00043","url":null,"abstract":"Programmability and performance are often considered alternatives in the context of HPC programming systems. For example, general purpose frameworks like MPI are associated with high performance, and though MapReduce and similar frameworks have demonstrated high programmability, it is also well accepted that they fall short in terms of performance. Providing abstractions that maintain high programmability and performance remains an open question. In this paper, we introduce two different variations of the original MapReduce API. We demonstrate efficient implementations of the three APIs, focusing on how the API differences impact middleware implementation, and examine the resulting performance. Furthermore, to understand how application characteristics impact relative performance of the three systems, we develop and validate a performance model. Overall, we show that a MapReduce-like API that only requires small additional effort from programmers can provide high performance, outperforming Spark significantly.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"28 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120879934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Acceleration of an Adaptive Cartesian Mesh CFD Solver in the Current Generation Processor Architectures
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00025
V. HarichandM., Bharatkumar Sharma, G. Sudhakaran, V. Ashok
{"title":"Acceleration of an Adaptive Cartesian Mesh CFD Solver in the Current Generation Processor Architectures","authors":"V. HarichandM., Bharatkumar Sharma, G. Sudhakaran, V. Ashok","doi":"10.1109/HiPC.2018.00025","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00025","url":null,"abstract":"In this paper, the challenges involved in accelerating the adaptive Cartesian mesh CFD solver PARAS-3D on current-generation processors (CPUs and GPUs) are explored. CFD codes are known for their memory-bound nature, which remains a significant bottleneck in achieving higher performance. Adaptive Cartesian meshes, with their oct-tree structure, bring additional challenges for data parallelism. Moreover, Cartesian mesh solvers have higher memory bandwidth requirements due to their larger and varying stencil. The paper details how the redesign and reimplementation of a legacy Cartesian mesh CFD solver helped achieve higher performance on CPUs through improvements in algorithms and data structures. Moreover, very good scalability to thousands of cores was achieved using asynchronous communication and weighted graph partitioning. A Structure of Arrays (SoA) based data layout, along with GPU features such as Unified Memory and Multi-Process Service, was used in the GPU acceleration process to obtain a performance of 4.4x on top of the CPU-only version using NVIDIA Quadro GV100 GPUs.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125432978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Vidya: Performing Code-Block I/O Characterization for Data Access Optimization
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00036
H. Devarajan, Anthony Kougkas, Prajwal Challa, Xian-He Sun
{"title":"Vidya: Performing Code-Block I/O Characterization for Data Access Optimization","authors":"H. Devarajan, Anthony Kougkas, Prajwal Challa, Xian-He Sun","doi":"10.1109/HiPC.2018.00036","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00036","url":null,"abstract":"Understanding, characterizing and tuning scientific applications' I/O behavior is an increasingly complicated process in HPC systems. Existing tools use either offline profiling or online analysis to get insights into the applications' I/O patterns. However, there is a lack of a clear formula to characterize applications' I/O. Moreover, these tools are application specific and do not account for multi-tenant systems. This paper presents Vidya, an I/O profiling framework which can predict an application's I/O intensity using a new formula called Code-Block I/O Characterization (CIOC). Using CIOC, developers and system architects can tune an application's I/O behavior and better match the underlying storage system to maximize performance. Evaluation results show that Vidya can predict an application's I/O intensity with a variance of 0.05%. Vidya can profile applications with a high accuracy of 98% while reducing profiling time by 9x. We further show how Vidya can optimize an application's I/O time by 3.7x.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121767375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Title Page i
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/hipc.2018.00001
{"title":"Title Page i","authors":"","doi":"10.1109/hipc.2018.00001","DOIUrl":"https://doi.org/10.1109/hipc.2018.00001","url":null,"abstract":"","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130253187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Synchronization-Avoiding Graph Algorithms
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00015
J. Firoz, Marcin Zalewski, Thejaka Amila Kanewala, A. Lumsdaine
{"title":"Synchronization-Avoiding Graph Algorithms","authors":"J. Firoz, Marcin Zalewski, Thejaka Amila Kanewala, A. Lumsdaine","doi":"10.1109/HiPC.2018.00015","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00015","url":null,"abstract":"Because they were developed for optimal sequential complexity, classical graph algorithms as found in textbooks have strictly-defined orders of operations. Enforcing a prescribed order of operations, or even an approximate order, in a distributed memory setting requires significant amounts of synchronization, which in turn can severely limit scalability. As a result, new algorithms are typically required to achieve scalable performance, even for solving well-known graph problems. Yet, even in these cases, parallel graph algorithms are written according to parallel programming models that evolved for, e.g., scientific computing, and that still have inherent, and scalability-limiting, amounts of synchronization. In this paper we present a new approach to parallel graph algorithms: synchronization-avoiding algorithms. To eliminate synchronization and its associated overhead, synchronization-avoiding algorithms perform work in an unordered and fully asynchronous fashion in such a way that the result is constantly refined toward its final state. \"Wasted\" work is minimized by locally prioritizing tasks using problem-dependent task utility metrics. We classify algorithms for graph applications into two broad categories: algorithms with monotonic updates (which evince global synchronization) and algorithms with non-monotonic updates (which evince vertex-centric synchronization). We apply our approach to both classes and develop novel, synchronization-avoiding algorithms for solving exemplar problems: SSSP and connected components for the former, graph coloring for the latter. We demonstrate that eliminating synchronization in conjunction with effective scheduling policies and optimizations in the runtime results in improved scalability for both classes of algorithms.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132100606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Characterization of the Impact of Soft Errors on Iterative Methods
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00031
B. O. Mutlu, Gokcen Kestor, J. Manzano, O. Unsal, S. Chatterjee, S. Krishnamoorthy
{"title":"Characterization of the Impact of Soft Errors on Iterative Methods","authors":"B. O. Mutlu, Gokcen Kestor, J. Manzano, O. Unsal, S. Chatterjee, S. Krishnamoorthy","doi":"10.1109/HiPC.2018.00031","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00031","url":null,"abstract":"Soft errors caused by transient bit flips have the potential to significantly impact an application's behavior. This has motivated the design of an array of techniques to detect, isolate, and correct soft errors using microarchitectural, architectural, compilation-based, or application-level techniques to minimize their impact on the executing application. The first step toward the design of good error detection/correction techniques involves an understanding of an application's vulnerability to soft errors. In this paper, we present the first comprehensive characterization of the impact of soft errors on the convergence characteristics of six iterative methods using application-level fault injection. In particular, we consider the use of iterative methods to incrementally solve a linear system of equations, which constitute the core kernel in many scientific applications. We analyze the impact of soft errors in terms of the type of error (single- vs. multi-bit), the distribution and location of bits affected, the data structure and statement impacted, and variation with time. In addition to understanding the vulnerability of iterative solvers to soft errors, this characterization can aid the design of fault injection campaigns that ensure systematic coverage.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125953800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
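To make the notion of a transient bit flip in the abstract above concrete, the following is a minimal sketch of single-bit fault injection into an IEEE 754 double. It illustrates only why the location of the affected bit matters. It is not the authors' injection framework, and the function name `flip_bit` is hypothetical.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit of a float's 64-bit IEEE 754 encoding.

    Bit 0 is the least significant mantissa bit, bits 52-62 are the
    exponent, and bit 63 is the sign.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


print(flip_bit(1.0, 0))    # low mantissa bit: a change of one unit in the last place
print(flip_bit(1.0, 62))   # top exponent bit: 1.0 becomes infinity
print(flip_bit(1.0, 63))   # sign bit: 1.0 becomes -1.0
```

A flip in a low mantissa bit perturbs the value by about 2.2e-16, while a flip in a high exponent bit changes it by many orders of magnitude, so the same fault model can have very different effects depending on where it lands.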
Secure High-Performance Computer Architectures: Challenges and Opportunities
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00038
S. Devadas
{"title":"Secure High-Performance Computer Architectures: Challenges and Opportunities","authors":"S. Devadas","doi":"10.1109/HiPC.2018.00038","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00038","url":null,"abstract":"Recent work has shown that architectural isolation can be violated through software side channel attacks that exploit microarchitectural performance optimizations such as speculation to leak secrets. While turning off microarchitectural optimizations can preclude some classes of attacks, we argue that performance and security do not have to be in conflict, provided processors are designed with security in mind. We espouse a principled approach to eliminating entire attack surfaces through microarchitectural isolation, rather than plugging attack-specific privacy leaks. We argue that minimal modifications to hardware can defend against all currently-practical side channel attacks and without significant performance impact. As an application of this approach, we describe the Sanctum processor architecture that offers strong provable isolation of software modules running concurrently and sharing resources, and Sanctoom, a speculative, out-of-order variant with similar properties. These processors provide isolation even when large parts of the operating system are compromised, and their open-source implementations allow security properties to be independently verified. Biography: Srini Devadas is the Webster Professor of EECS at MIT where he has been on the faculty since 1988. His current research interests are in computer security, computer architecture and applied cryptography. Devadas received the 2017 IEEE W. Wallace McDowell award and the 2018 IEEE Charles A. Desoer Technical Achievement award for his research in secure hardware. He is the author of “Programming for the Puzzled” (MIT Press, 2017), a book that builds a bridge between the recreational world of algorithmic puzzles and the pragmatic world of computer programming, teaching readers to program while solving puzzles. Devadas is a MacVicar Faculty Fellow, an Everett Moore Baker and a Bose award recipient, considered MIT’s highest teaching honors.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129265280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Looking Under the Hood of Deep Neural Networks
2018 IEEE 25th International Conference on High Performance Computing (HiPC) Pub Date : 2018-12-01 DOI: 10.1109/HiPC.2018.00009
Balaraman Ravindran
{"title":"Looking Under the Hood of Deep Neural Networks","authors":"Balaraman Ravindran","doi":"10.1109/HiPC.2018.00009","DOIUrl":"https://doi.org/10.1109/HiPC.2018.00009","url":null,"abstract":"Treating deep neural networks as black boxes and using them as-is from a toolbox could potentially lead to sub-optimal performance. Increasingly, machine learning researchers have to be more aware of the computational workloads entailed by their models and how to optimize for them. In this talk, I will describe three different pieces of our recent work with deep convolutional networks and their variants in improving inference performance across a variety of tasks like object detection, identification, tracking, etc. These studies demonstrate the need for peeling back the cover and paying attention to the computation even when using standard models. Biography: Balaraman Ravindran is a professor at the Department of Computer Science and Engineering, and the head of the Robert Bosch Centre for Data Science and AI at the Indian Institute of Technology Madras. His current research interests span the broader area of machine learning, ranging from Spatio-temporal Abstractions in Reinforcement Learning to social network analysis and Data/Text Mining. Much of the work in his group is directed toward understanding interactions and learning from them.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131226814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0