{"title":"Significantly improving lossy compression quality based on an optimized hybrid prediction model","authors":"Xin Liang, S. Di, Sihuan Li, Dingwen Tao, Bogdan Nicolae, Zizhong Chen, F. Cappello","doi":"10.1145/3295500.3356193","DOIUrl":"https://doi.org/10.1145/3295500.3356193","url":null,"abstract":"With the ever-increasing volumes of data produced by today's large-scale scientific simulations, error-bounded lossy compression techniques have become critical: not only can they significantly reduce the data size but they also can retain high data fidelity for postanalysis. In this paper, we design a strategy to improve the compression quality significantly based on an optimized, hybrid prediction model. Our contribution is fourfold. (1) We propose a novel, transform-based predictor and optimize its compression quality. (2) We significantly improve the coefficient-encoding efficiency for the data-fitting predictor. (3) We propose an adaptive framework that can select the best-fit predictor accurately for different datasets. (4) We evaluate our solution and several existing state-of-the-art lossy compressors by running real-world applications on a supercomputer with 8,192 cores. Experiments show that our adaptive compressor can improve the compression ratio by 112~165% compared with the second-best compressor. The parallel I/O performance is improved by about 100% because of the significantly reduced data size. The total I/O time is reduced by up to 60X with our compressor compared with the original I/O time.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"28 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124136668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LPCC","authors":"Y. Qian, Xi Li, Shu Ihara, A. Dilger, C. Thomaz, Shilong Wang, Wen Cheng, Chunyan Li, Lingfang Zeng, Fang Wang, Danfeng Feng, Tim Süß, A. Brinkmann","doi":"10.1145/3295500.3356139","DOIUrl":"https://doi.org/10.1145/3295500.3356139","url":null,"abstract":"Most high-performance computing (HPC) clusters use a global parallel file system to enable high data throughput. The parallel file system is typically centralized and its storage media are physically separated from the compute cluster. Compute nodes as clients of the parallel file system are often additionally equipped with SSDs. The node internal storage media are rarely well-integrated into the I/O and compute workflows. How to make full and flexible use of these storage media is therefore a valuable research question. In this paper, we propose a hierarchical Persistent Client Caching (LPCC) mechanism for the Lustre file system. LPCC provides two modes: RW-PCC builds a read-write cache on the local SSD of a single client; RO-PCC distributes a read-only cache over the SSDs of multiple clients. LPCC integrates with the Lustre HSM solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, meanwhile maintaining a global unified namespace of the entire Lustre file system. The evaluation results presented in this paper show LPCC's advantages for various workloads, enabling even speed-ups linear in the number of clients for several real-world scenarios.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126545965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive neural network-based approximation to accelerate eulerian fluid simulation","authors":"Wenqian Dong, Jie Liu, Zhen Xie, Dong Li","doi":"10.1145/3295500.3356147","DOIUrl":"https://doi.org/10.1145/3295500.3356147","url":null,"abstract":"The Eulerian fluid simulation is an important HPC application. The neural network has been applied to accelerate it. The current methods that accelerate the fluid simulation with neural networks lack flexibility and generalization. In this paper, we tackle the above limitation and aim to enhance the applicability of neural networks in the Eulerian fluid simulation. We introduce Smart-fluidnet, a framework that automates model generation and application. Given an existing neural network as input, Smart-fluidnet generates multiple neural networks before the simulation to meet the execution time and simulation quality requirement. During the simulation, Smart-fluidnet dynamically switches the neural networks to make best efforts to reach the user's requirement on simulation quality. Evaluating with 20,480 input problems, we show that Smart-fluidnet achieves 1.46x and 590x speedup comparing with a state-of-the-art neural network model and the original fluid simulation respectively on an NVIDIA Titan X Pascal GPU, while providing better simulation quality than the state-of-the-art model.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125686300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spread-n-share: improving application performance and cluster throughput with resource-aware job placement","authors":"Xiongchao Tang, Haojie Wang, Xiaosong Ma, Nosayba El-Sayed, Jidong Zhai, Wenguang Chen, Ashraf Aboulnaga","doi":"10.1145/3295500.3356152","DOIUrl":"https://doi.org/10.1145/3295500.3356152","url":null,"abstract":"Traditional batch job schedulers adopt the Compact-n-Exclusive (CE) strategy, packing processes of a parallel job into as few compute nodes as possible. While CE minimizes inter-node network communication, it often brings self-contention among tasks of a resource-intensive application. Recent studies have used virtual containers to balance CPU utilization and memory capacity across physical nodes, but the imbalance in cache and memory bandwidth usage is still under-investigated. In this work, we propose Spread-n-Share (SNS): a new batch scheduling strategy that automatically scales resource-bound applications out onto more nodes to alleviate their performance bottleneck, and co-locate jobs in a resource compatible manner. We implement Uberun, a prototype scheduler to validate SNS, considering shared-cache capacity and memory bandwidth as two types of performance-critical shared resources. Experimental results using 12 diverse cluster workloads show that SNS improves the overall system throughput by 19.8% on average over CE, while achieving an average individual job speedup of 1.8%.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124872063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ComDetective: a lightweight communication detection tool for threads","authors":"Muhammad Aditya Sasongko, Milind Chabbi, Palwisha Akhtar, D. Unat","doi":"10.1145/3295500.3356214","DOIUrl":"https://doi.org/10.1145/3295500.3356214","url":null,"abstract":"Inter-thread communication is a vital performance indicator in shared-memory systems. Prior works on identifying inter-thread communication employed hardware simulators or binary instrumentation and suffered from inaccuracy or high overheads---both space and time---making them impractical for production use. We propose ComDetective, which produces communication matrices that are accurate and introduces low runtime and low memory overheads, thus making it practical for production use. ComDetective employs hardware performance counters to sample memory-access events and uses hardware debug registers to sample communicating pairs of threads. ComDetective can differentiate communication as true or false sharing between threads. Its runtime and memory overheads are only 1.30X and 1.27X, respectively, for the 18 applications studied under 500K sampling period. Using ComDetective, we produce insightful communication matrices for microbenchmarks, PARSEC benchmark suite, and several CORAL applications and compare the generated matrices against MPI counterparts. Guided by ComDetective, we optimize a few codes and achieve up to 13% speedup.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122739960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consensus equilibrium framework for super-resolution and extreme-scale CT reconstruction","authors":"Xiao Wang, V. Sridhar, Z. Ronaghi, R. Thomas, J. Deslippe, D. Parkinson, G. Buzzard, S. Midkiff, C. Bouman, S. Warfield","doi":"10.1145/3295500.3356142","DOIUrl":"https://doi.org/10.1145/3295500.3356142","url":null,"abstract":"Computed tomography (CT) image reconstruction is a crucial technique for many imaging applications. Among various reconstruction methods, Model-Based Iterative Reconstruction (MBIR) enables super-resolution with superior image quality. MBIR, however, has a high memory requirement that limits the achievable image resolution, and the parallelization for MBIR suffers from limited scalability. In this paper, we propose Asynchronous Consensus MBIR (AC-MBIR) that uses Consensus Equilibrium (CE) to provide a super-resolution algorithm with a small memory footprint, low communication overhead and a high scalability. Super-resolution experiments show that AC-MBIR has a 6.8 times smaller memory footprint and 16 times more scalability, compared with the state-of-the-art MBIR implementation, and maintains a 100% strong scaling efficiency at 146880 cores. In addition, AC-MBIR achieves an average bandwidth of 3.5 petabytes per second at 587520 cores.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"15 17","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120813765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting reuse and vectorization in blocked stencil computations on CPUs and GPUs","authors":"Tuowen Zhao, P. Basu, Samuel Williams, Mary W. Hall, H. Johansen","doi":"10.1145/3295500.3356210","DOIUrl":"https://doi.org/10.1145/3295500.3356210","url":null,"abstract":"Stencil computations in real-world scientific applications may contain multiple interrelated stencils, have multiple input grids, and use higher order discretizations with high arithmetic intensity and complex expression structures. In combination, these properties place immense demands on the memory hierarchy that limit performance. Blocking techniques like tiling are used to exploit reuse in caches. Additional fine-grain data blocking can also reduce TLB, hardware prefetch, and cache pressure. In this paper, we present a code generation approach designed to further improve tiled stencil performance by exploiting reuse within the block, increasing instruction-level parallelism, and exposing opportunities for the backend compiler to eliminate redundant computation. It also enables efficient vector code generation for CPUs and GPUs. For a wide range of complex stencil computations, we are able to achieve substantial speedups over tiled baselines for the Intel KNL, Intel Skylake-X, and NVIDIA P100 architectures.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"121 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113945727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding congestion in high performance interconnection networks using sampling","authors":"Philip Taffet, J. Mellor-Crummey","doi":"10.1145/3295500.3356168","DOIUrl":"https://doi.org/10.1145/3295500.3356168","url":null,"abstract":"To improve the communication performance of an application executing on a cluster or supercomputer, developers need tools that enable them to understand how the application's communication patterns interact with the system's network, especially when those interactions result in congestion. Since communication performance is difficult to reason about analytically and simulation is costly, measurement-based approaches are needed. This paper describes a new sampling-based technique to collect information about the path a packet takes and congestion it encounters. We describe a variant of this scheme that requires only 5--6 bits of information in a monitored packet, making it practical for use in next-generation networks. Network simulations using communication traces for miniGhost (a synthetic 3D finite difference mini-application) and pF3D (a code that simulates laser-plasma interactions) show that our technique provides precise application-centric quantitative information about traffic and congestion that can be used to distinguish between problems with an application's communication patterns, its mapping onto a parallel system, and outside interference.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114278751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncore power scavenger: a runtime for uncore power conservation on HPC systems","authors":"Neha Gholkar, F. Mueller, B. Rountree","doi":"10.1145/3295500.3356150","DOIUrl":"https://doi.org/10.1145/3295500.3356150","url":null,"abstract":"The US Department of Energy (DOE) has set a power target of 20-30MW on the first exascale machines. To achieve one exaflop under this power constraint, it is necessary to minimize wasteful consumption of power while striving to improve performance. Toward this end, we investigate uncore frequency scaling (UFS) as a potential knob for reducing the power footprint of HPC jobs. We propose Uncore Power Scavenger (UPSCavenger), a runtime system that dynamically detects phase changes and automatically sets the best uncore frequency for every phase to save power without significant impact on performance. Our experimental evaluations on a cluster show that UPSCavenger achieves up to 10% energy savings with under 1% slowdown. It achieves 14% energy savings with the worst case slowdown of 5.5%. We also show that UPSCavenger achieves up to 20% speedup and proportional energy savings compared to Intel's RAPL with equivalent power usage making it a viable solution even for power-constrained computing.","PeriodicalId":124077,"journal":{"name":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117332386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}