2019 IEEE High Performance Extreme Computing Conference (HPEC): Latest Papers

One Quadrillion Triangles Queried on One Million Processors
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916243
R. Pearce, Trevor Steil, Benjamin W. Priest, G. Sanders
Abstract: We update our prior 2017 Graph Challenge submission [7] on large-scale triangle counting in distributed memory by demonstrating scaling and validation on trillion-edge scale-free graphs. We incorporate recent distributed communication optimizations developed for irregular communication workloads [1], and demonstrate scaling up to 1.5 million cores of IBM BG/Q Sequoia at LLNL. We validate our implementation using nonstochastic Kronecker graph generation where ground-truth local and global triangle counts are known, and model our Kronecker graph inputs after the Graph500 [5] R-MAT inputs. To our knowledge, our results are the largest triangle count experiments on synthetic scale-free graphs to date.
Citations: 16
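The validation idea in the abstract above relies on Kronecker graphs whose triangle counts can be derived from their factors. The following is a minimal Python sketch of that idea, under stated assumptions: both factors are small undirected graphs without self-loops, for which triangles(A ⊗ B) = 6 · triangles(A) · triangles(B) follows from trace((A ⊗ B)^3) = trace(A^3) · trace(B^3). The helper names `kron`, `count_triangles`, and `complete_graph` are illustrative only; this is not the authors' generator or their distributed implementation.

```python
def kron(a, b):
    """Kronecker product of two square 0/1 adjacency matrices (lists of lists)."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def count_triangles(adj):
    """Count triangles in an undirected, loop-free graph given as a 0/1 matrix."""
    n = len(adj)
    nbrs = [{j for j in range(n) if adj[i][j]} for i in range(n)]
    total = 0
    for u in range(n):
        for v in nbrs[u]:
            if v > u:
                # Count each triangle once, as the ordered triple u < v < w.
                total += sum(1 for w in nbrs[u] & nbrs[v] if w > v)
    return total


def complete_graph(n):
    """Adjacency matrix of K_n (assumed test input, no self-loops)."""
    return [[int(i != j) for j in range(n)] for i in range(n)]


k3, k4 = complete_graph(3), complete_graph(4)
predicted = 6 * count_triangles(k3) * count_triangles(k4)  # ground truth from the factors
measured = count_triangles(kron(k3, k4))                   # brute-force count on the product
assert measured == predicted
```

Because the predicted count comes only from the small factors, the same check stays cheap even when the product graph is far too large to count by brute force, which is what makes this kind of input useful for validation.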
C to D-Wave: A High-level C Compilation Framework for Quantum Annealers
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916231
Mohamed W. Hassan, S. Pakin, Wu-chun Feng
Abstract: A quantum annealer solves optimization problems by exploiting quantum effects. Problems are represented as Hamiltonian functions that define an energy landscape. The quantum-annealing hardware relaxes to a solution corresponding to the ground state of the energy landscape. Expressing arbitrary programming problems in terms of real-valued Hamiltonian-function coefficients is unintuitive and challenging. This paper addresses the difficulty of programming quantum annealers by presenting a compilation framework that compiles a subset of C code to a quantum machine instruction (QMI) to be executed on a quantum annealer. Our work is based on a modular software stack that facilitates programming D-Wave quantum annealers by successively lowering code from C to Verilog to a symbolic “quantum macro assembly language” and finally to a device-specific Hamiltonian function. We demonstrate the capabilities of our software stack on a set of problems written in C and executed on a D-Wave 2000Q quantum annealer.
Citations: 5
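To make concrete what "expressing a problem as Hamiltonian coefficients" can look like, here is a small Python sketch. It assumes a QUBO-style formulation over 0/1 variables and uses a standard textbook penalty for a single AND gate; it does not reproduce the paper's QMI format, its quantum macro assembly language, or any D-Wave API. The function name `and_penalty` is hypothetical.

```python
from itertools import product


def and_penalty(x, y, z):
    """Energy of a quadratic penalty for the constraint z == (x AND y)."""
    return x * y - 2 * (x + y) * z + 3 * z


# Enumerate all eight assignments and keep the minimum-energy ones.
energies = {bits: and_penalty(*bits) for bits in product((0, 1), repeat=3)}
ground = min(energies.values())
ground_states = sorted(bits for bits, e in energies.items() if e == ground)

# The ground states are exactly the rows of the AND truth table.
assert ground_states == sorted((x, y, x & y) for x, y in product((0, 1), repeat=2))
```

An annealer that relaxes to the ground state of this energy function therefore returns a valid input/output combination of the gate, which is why lowering a program to logic gates is a plausible intermediate step toward a Hamiltonian.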
Synthesis of Hardware Sandboxes for Trojan Mitigation in Systems on Chip
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916526
C. Bobda, Taylor J. L. Whitaker, Joel Mandebi Mbongue, S. Saha
Abstract: In this work, we propose a high-level synthesis approach for hardware sandboxes in system-on-chip. Using interface formalism to capture interactions between non-trusted IPs and trusted parts of a system on chip, along with the properties specification language to specify non-authorized actions of non-trusted IPs, sandboxes are generated and made ready for inclusion as IP in a system-on-chip design. The concepts of composition, compatibility, and refinement are used to capture illegal actions and optimize resources across the boundary of single IPs. We have designed a tool that automatically generates the sandboxes and facilitates their integration into a system-on-chip. Our approach was validated with benchmarks from trust-hub.com and FPGA implementations. All our results showed 100% Trojan detection and mitigation, with only a minimal increase in resource overhead and no performance decrease.
Citations: 2
Update on k-truss Decomposition on GPU
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916285
M. Almasri, Omer Anjum, Carl Pearson, Zaid Qureshi, Vikram Sharma Mailthody, R. Nagi, Jinjun Xiong, Wen-mei W. Hwu
Abstract: In this paper, we present an update to our previous submission on k-truss decomposition from Graph Challenge 2018. For the single-k k-truss implementation, we propose multiple algorithmic optimizations that improve performance by up to 35.2x (6.9x on average) compared to our previous GPU implementation. In addition, we present a scalable multi-GPU implementation in which each GPU handles a different ‘k’ value. Compared to our prior multi-GPU implementation, the proposed approach is faster by up to 151.3x (78.8x on average). When only the edges of the maximal k-truss are sought, incrementing the ‘k’ value in each iteration is inefficient, particularly for graphs with a large maximum k-truss. Thus, we propose a binary search over the ‘k’ value to find the maximal k-truss. The binary search approach on a single GPU is up to 101.5x (24.3x on average) faster than our 2018 k-truss submission. Lastly, we show that the proposed binary search finds the maximum k-truss for the “Twitter” graph dataset, which has 2.8 billion bidirectional edges, in just 16 minutes on a single V100 GPU.
Citations: 21
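The binary search over k mentioned in the abstract above can be sketched on the CPU as follows. This is a simplified single-threaded illustration, not the authors' GPU implementation: it assumes a simple undirected graph (no self-loops or duplicate edges) with comparable vertex IDs, and the function names `k_truss` and `max_k_truss` are hypothetical. It relies on the fact that a non-empty k-truss implies a non-empty (k-1)-truss, so emptiness is monotone in k.

```python
def k_truss(edges, k):
    """Edges of the k-truss: each surviving edge lies in at least k - 2 triangles."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:  # peel until no remaining edge has too little support
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {(u, v) for u in adj for v in adj[u] if u < v}


def max_k_truss(edges):
    """Binary search for the largest k whose k-truss is non-empty."""
    edges = list(edges)
    if not edges:
        return 0, set()
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # The 2-truss is the whole graph; an edge in a k-truss needs
    # endpoints of degree >= k - 1, so k <= max degree + 1.
    lo, hi = 2, max(degree.values()) + 1
    best = k_truss(edges, lo)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        truss = k_truss(edges, mid)
        if truss:            # non-empty: the answer is at least mid
            lo, best = mid, truss
        else:                # empty: every larger k is empty as well
            hi = mid - 1
    return lo, best


# K4 on vertices 0..3 plus a pendant edge (3, 4).
graph = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
k, truss = max_k_truss(graph)
assert k == 4 and (3, 4) not in truss
```

Each probe recomputes the truss from the full edge list, which keeps the sketch short; the number of probes is logarithmic in the search range instead of linear, which is the point the abstract makes about graphs with a large maximum k-truss.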
Scaling and Quality of Modularity Optimization Methods for Graph Clustering
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916299
Sayan Ghosh, M. Halappanavar, Antonino Tumeo, A. Kalyanaraman
Abstract: Real-world graphs exhibit structures known as “communities” or “clusters” consisting of a group of vertices with relatively high connectivity between them, as compared to the rest of the vertices in the network. Graph clustering or community detection is a fundamental graph operation used to analyze real-world graphs occurring in the areas of computational biology, cybersecurity, electrical grids, etc. Similar to other graph algorithms, owing to irregular memory accesses and inherently sequential nature, current algorithms for community detection are challenging to parallelize. However, in order to analyze large networks, it is important to develop scalable parallel implementations of graph clustering that are capable of exploiting the architectural features of modern supercomputers. In response to the 2019 Streaming Graph Challenge, we present quality and performance analysis of our distributed-memory community detection using Vite, which is our distributed memory implementation of the popular Louvain method, on the ALCF Theta supercomputer. Clustering methods such as Louvain that rely on modularity maximization are known to suffer from the resolution limit problem, preventing identification of clusters of certain sizes. Hence, we also include quality analysis of our shared-memory implementation of the Fast-tracking Resistance method, in comparison with Louvain on the challenge datasets. Furthermore, we introduce an edge-balanced graph distribution for our distributed memory implementation that significantly reduces communication, offering up to 80% improvement in the overall execution time. In addition to performance/quality analysis, we also include details on the power/energy consumption and memory traffic of the distributed-memory clustering implementation using real-world graphs with over a billion edges.
Citations: 10
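The quantity that modularity-based methods such as Louvain try to maximize can be computed directly for a given partition. The sketch below assumes an undirected, unweighted graph given as an edge list and uses the standard definition Q = Σ_c [ e_c / m − (d_c / 2m)^2 ], where e_c is the number of intra-community edges and d_c the total degree of community c. It is an illustration only; it is not Vite and does not implement the Louvain heuristic itself. The function name `modularity` and the toy graph are assumptions.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    intra = {}    # number of edges with both endpoints in the same community
    degree = {}   # sum of vertex degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


# Two triangles joined by one bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {v: "all" for v in range(6)}

assert abs(modularity(edges, split) - 5 / 14) < 1e-12   # two communities
assert abs(modularity(edges, merged)) < 1e-12           # one community scores 0
```

Splitting the toy graph at the bridge scores higher than lumping everything together, which is the behavior a modularity maximizer exploits when it merges or separates communities.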
HPEC 2019 Title Page
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/hpec.2019.8916315
Citations: 0
Many-target, Many-sensor Ship Tracking and Classification
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916332
Leonard Kosta, John Irvine, Laura Seaman, H. Xi
Abstract: Government agencies such as DARPA wish to know the numbers, locations, tracks, and types of vessels moving through strategically important regions of the ocean. We implement a multiple hypothesis testing algorithm to simultaneously track dozens of ships with longitude and latitude data from many sensors, then use a combination of behavioral fingerprinting and deep learning techniques to classify each vessel by type. The number of targets is unknown a priori. We achieve both high track purity and high classification accuracy on several datasets.
Citations: 0
Graph Algorithms in PGAS: Chapel and UPC++
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916309
Louis Jenkins, J. Firoz, Marcin Zalewski, C. Joslyn, Mark Raugas
Abstract: The Partitioned Global Address Space (PGAS) programming model can be implemented either with programming language features or with runtime library APIs, each implementation favoring different aspects (e.g., productivity, abstraction, flexibility, or performance). Certain language and runtime features, such as collectives, explicit and asynchronous communication primitives, and constructs facilitating overlap of communication and computation (such as futures and conjoined futures) can enable better performance and scaling for irregular applications, in particular for distributed graph analytics. We compare graph algorithms in one of each of these environments: the Chapel PGAS programming language and the UPC++ PGAS runtime library. We implement algorithms for breadth-first search and triangle counting graph kernels in both environments. We discuss the code in each of the environments, and compile performance data on a Cray Aries and an Infiniband platform. Our results show that the library-based approach of UPC++ currently provides strong performance, while Chapel provides a high-level abstraction that, although harder to optimize, still provides comparable performance.
Citations: 4
A Survey on Hardware Security Techniques Targeting Low-Power SoC Designs
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916486
Alan Ehret, K. Gettings, B. R. Jordan, M. Kinsy
Abstract: In this work, we survey hardware-based security techniques applicable to low-power system-on-chip designs. Techniques related to a system’s processing elements, volatile main memory and caches, non-volatile memory and on-chip interconnects are examined. Threat models for each subsystem and technique are considered. Performance overheads and other trade-offs for each technique are discussed. Defenses with similar threat models are compared.
Citations: 10
Fast and Scalable Distributed Tensor Decompositions
2019 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2019-09-01 DOI: 10.1109/HPEC.2019.8916319
M. Baskaran, Thomas Henretty, J. Ezick
Abstract: Tensor decomposition is a prominent technique for analyzing multi-attribute data and is being increasingly used for data analysis in different application areas. Tensor decomposition methods are computationally intense and often involve irregular memory accesses over large-scale sparse data. Hence it becomes critical to optimize the execution of such data-intensive computations and associated data movement to reduce the eventual time-to-solution in data analysis applications. With the prevalence of using advanced high-performance computing (HPC) systems for data analysis applications, it is becoming increasingly important to provide fast and scalable implementations of tensor decompositions and execute them efficiently on modern and advanced HPC systems. In this paper, we present distributed tensor decomposition methods that achieve faster, memory-efficient, and communication-reduced execution on HPC systems. We demonstrate that our techniques reduce the overall communication and execution time of tensor decomposition methods when they are used for analyzing datasets of varied size from real applications. We illustrate our results on an HPE Superdome Flex server, a high-end modular system offering large-scale in-memory computing, and on a distributed cluster of Intel Xeon multi-core nodes.
Citations: 11