2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW): Latest Publications

Understanding Data Motion in the Modern HPC Data Center
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00012
Authors: Glenn K. Lockwood, S. Snyder, S. Byna, P. Carns, N. Wright
Abstract: The utilization and performance of storage, compute, and network resources within HPC data centers have been studied extensively, but much less work has gone toward characterizing how these resources are used in conjunction to solve larger scientific challenges. To address this gap, we present our work in characterizing workloads and workflows at a data-center-wide level by examining all data transfers that occurred between storage, compute, and the external network at the National Energy Research Scientific Computing Center over a three-month period in 2019. Using a simple abstract representation of data transfers, we analyze over 100 million transfer logs from Darshan, HPSS user interfaces, and Globus to quantify the load on data paths between compute, storage, and the wide-area network based on transfer direction, user, transfer tool, source, destination, and time. We show that parallel I/O from user jobs, while undeniably important, is only one of several major I/O workloads that occur throughout the execution of scientific workflows. We also show that this approach can be used to connect anomalous data traffic to specific users and file access patterns, and we construct time-resolved user transfer traces to demonstrate that one can systematically identify coupled data motion for individual workflows.
Citations: 9
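To make the paper's "abstract representation of data transfers" concrete, the sketch below aggregates transfer records by data path and by user. The record fields are hypothetical stand-ins, not the actual Darshan, HPSS, or Globus log schemas.

```python
from collections import defaultdict

# Each transfer reduced to: (user, tool, source tier, destination tier, bytes)
transfers = [
    ("alice", "darshan", "compute", "scratch", 4 * 2**30),
    ("alice", "globus",  "scratch", "wan",     1 * 2**30),
    ("bob",   "hpss",    "scratch", "tape",    8 * 2**30),
]

bytes_per_path = defaultdict(int)  # load on each (source, destination) path
bytes_per_user = defaultdict(int)  # per-user traffic, e.g. to flag anomalies

for user, tool, src, dst, nbytes in transfers:
    bytes_per_path[(src, dst)] += nbytes
    bytes_per_user[user] += nbytes

for (src, dst), nbytes in sorted(bytes_per_path.items()):
    print(f"{src:>8} -> {dst:<8} {nbytes / 2**30:6.1f} GiB")
```

Adding a timestamp field and grouping by (user, time window) would yield the kind of time-resolved per-user transfer traces the abstract mentions.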
[Copyright notice]
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/pdsw49588.2019.00002
Citations: 0
Towards Physical Design Management in Storage Systems
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00009
Authors: K. Dahlgren, J. LeFevre, Ashay Shirwadkar, Ken Iizawa, Aldrin Montana, P. Alvaro, C. Maltzahn
Abstract: In the post-Moore era, systems and devices with new architectures will arrive at a rapid rate, with significant impacts on the software stack. Applications will not be able to fully benefit from new architectures unless they can delegate adapting to new devices to lower layers of the stack. In this paper we introduce physical design management, which deals with the problem of identifying and executing transformations on physical designs of stored data, i.e., how data is mapped to storage abstractions like files, objects, or blocks, in order to improve performance. Physical design is traditionally placed with applications, access libraries, and databases, using hard-wired assumptions about underlying storage systems. Yet storage systems increasingly not only contain multiple kinds of storage devices with vastly different performance profiles but also move data among those storage devices, thereby changing the benefit of a particular physical design. We advocate placing physical design management in storage, identify interesting research challenges, provide a brief description of a prototype implementation in Ceph, and discuss the results of initial experiments at scale that are replicable using CloudLab. These experiments show the performance and resource utilization trade-offs associated with choosing different physical designs and choosing to transform between physical designs.
Citations: 3
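As a toy illustration of what a "physical design transformation" can mean, the sketch below re-maps the same logical records from a row-oriented layout to a column-oriented one. This is illustrative only; it is not the paper's Ceph prototype or its interfaces.

```python
def rows_to_columns(rows):
    """Transform a list of homogeneous records into one dict of columns."""
    columns = {key: [] for key in rows[0]}
    for row in rows:
        for key, value in row.items():
            columns[key].append(value)
    return columns

records = [
    {"id": 1, "temp": 21.5},
    {"id": 2, "temp": 22.1},
]
print(rows_to_columns(records))  # {'id': [1, 2], 'temp': [21.5, 22.1]}
```

A scan that touches only "temp" now reads one contiguous column instead of every record; whether such a transformation pays for itself is exactly the kind of trade-off the paper's experiments quantify.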
Applying Machine Learning to Understand Write Performance of Large-scale Parallel Filesystems
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00008
Authors: Bing Xie, Zilong Tan, P. Carns, J. Chase, K. Harms, J. Lofstead, S. Oral, Sudharshan S. Vazhkudai, Feiyi Wang
Abstract: In high-performance computing (HPC), I/O performance prediction offers the potential to improve the efficiency of scientific computing. In particular, accurate prediction can make runtime estimates more precise, guide users toward optimal checkpoint strategies, and better inform facility provisioning and scheduling policies. HPC I/O performance is notoriously difficult to predict and model, however, in large part because of inherent variability and a lack of transparency in the behaviors of constituent storage system components. In this work we seek to advance the state of the art in HPC I/O performance prediction by (1) modeling the mean performance to address high variability, (2) deriving model features from write patterns, system architecture, and system configurations, and (3) employing a Lasso regression model to improve model accuracy. We demonstrate the efficacy of our approach by applying it to a crucial subset of common HPC I/O motifs, namely, file-per-process checkpoint write workloads. We conduct experiments on two distinct production HPC platforms — Titan at the Oak Ridge Leadership Computing Facility and Cetus at the Argonne Leadership Computing Facility — to train and evaluate our models. We find that we can attain ≤ 30% relative error for 92.79% and 99.64% of the samples in our test set on these platforms, respectively.
Citations: 10
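A hedged sketch of the modeling recipe the abstract describes: fit a Lasso regression to features derived from write patterns and system configuration, then check what fraction of held-out samples fall within 30% relative error. The features and data below are invented for illustration; the paper derives its own feature set from real platform logs.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical features: process count, bytes per process, stripe count
X = rng.uniform([64, 2**20, 1], [4096, 2**30, 64], size=(200, 3))
y = 0.002 * X[:, 0] + 1e-8 * X[:, 1] + 5.0 * X[:, 2] + rng.normal(0, 10, 200)

model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X[:150], y[:150])                    # train on 150 samples
pred = model.predict(X[150:])                  # evaluate on the remaining 50
rel_err = np.abs(pred - y[150:]) / np.abs(y[150:])
print(f"fraction within 30% relative error: {(rel_err <= 0.3).mean():.2%}")
```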
A Foundation for Automated Placement of Data
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00010
Authors: Douglas Otstott, Ming Zhao, Latchesar Ionkov
Abstract: With the increasing complexity of memory and storage, it is important to automate the decision of how to assign data structures to memory and storage devices. On one hand, this requires developing models to reconcile application access patterns against the limited capacity of higher-performance devices. On the other, such a modeling task demands a set of primitives to build from, and a toolkit that implements those primitives in a robust, dynamic fashion. We focus on the latter problem, and to that end we present an interface that abstracts the physical layout of data from the application developer. This will allow developers focused on optimized data placement to use our abstractions as the basis for their implementation, while application developers will see a unified, scalable, and resilient memory environment.
Citations: 0
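The sketch below speculates at what such placement primitives might look like: applications allocate named regions, and a policy layer, not the developer, decides which device tier backs them. Every name here is hypothetical; the paper's actual interface is only outlined in its text.

```python
class PlacementManager:
    """Toy placement policy over an ordered list of (tier, capacity) pairs."""

    def __init__(self, tiers):
        self.tiers = tiers                       # fastest tier first
        self.used = {name: 0 for name, _ in tiers}
        self.placement = {}                      # region -> tier name

    def allocate(self, region, size, hotness):
        """Prefer fast tiers for hot data, slow tiers for cold data."""
        order = self.tiers if hotness > 0.5 else list(reversed(self.tiers))
        for name, capacity in order:
            if self.used[name] + size <= capacity:
                self.used[name] += size
                self.placement[region] = name
                return name
        raise MemoryError(f"no tier has room for {region}")

mgr = PlacementManager([("hbm", 2**30), ("dram", 16 * 2**30), ("nvme", 2**40)])
print(mgr.allocate("lattice", 512 * 2**20, hotness=0.9))  # -> hbm
```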
Profiling Platform Storage Using IO500 and Mistral
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00011
Authors: Nolan D. Monnier, J. Lofstead, Margaret Lawson, M. Curry
Abstract: This paper explores how we used IO500 and the Mistral tool from Ellexus to observe detailed performance characteristics and inform I/O performance tuning on Astra, an ARM-based Sandia machine with an all-flash, Lustre-based storage array. Through this case study, we demonstrate that IO500 serves as a meaningful storage benchmark, even for all-flash storage. We also demonstrate that using fine-grained profiling tools, such as Mistral, is essential for revealing the details of tuning requirements. Overall, this paper demonstrates the value of a broad-spectrum benchmark like IO500, together with a fine-grained performance analysis tool such as Mistral, for understanding detailed storage system performance and enabling better-informed tuning.
Citations: 4
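For context on why IO500 works as a "broad spectrum" benchmark: its composite score aggregates many phases with geometric means, so one slow phase cannot be hidden behind a fast one. The sketch below shows that aggregation shape with made-up numbers; the real benchmark's phase list and scoring rules are more detailed.

```python
from math import prod

def geomean(values):
    return prod(values) ** (1.0 / len(values))

# Invented phase results, loosely following IO500's phase naming convention
bw_gib_s = {"ior-easy-write": 42.0, "ior-hard-write": 3.1,
            "ior-easy-read": 55.0, "ior-hard-read": 4.2}
md_kiops = {"mdtest-easy-write": 120.0, "mdtest-hard-write": 18.0,
            "mdtest-easy-stat": 310.0, "mdtest-hard-read": 25.0}

bw_score = geomean(list(bw_gib_s.values()))   # bandwidth sub-score
md_score = geomean(list(md_kiops.values()))   # metadata sub-score
print(f"bw={bw_score:.2f} GiB/s  md={md_score:.2f} kIOPS  "
      f"score={geomean([bw_score, md_score]):.2f}")
```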
Enabling Transparent Asynchronous I/O using Background Threads
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00006
Authors: Houjun Tang, Q. Koziol, S. Byna, J. Mainzer, Tonglin Li
Abstract: With scientific applications moving toward exascale levels, an increasing amount of data is being produced and analyzed. Providing efficient data access is crucial to the productivity of the scientific discovery process. Compared to improvements in CPU and network speeds, I/O performance lags far behind, such that moving data across the storage hierarchy can take longer than data generation or analysis. To alleviate this I/O bottleneck, asynchronous read and write operations are provided by the POSIX and MPI-I/O interfaces; they can overlap I/O operations with computation and thus hide I/O latency. However, these standards lack support for non-data operations such as file open, stat, and close, and their read and write operations require users to manually manage data dependencies and work with low-level byte offsets, which demands significant effort and expertise from applications. To overcome these issues, we present an asynchronous I/O framework that supports all I/O operations and manages data dependencies transparently and automatically. Our prototype asynchronous I/O implementation as an HDF5 VOL connector demonstrates the effectiveness of hiding the I/O cost from the application with low overhead and an easy-to-use programming interface.
Citations: 11
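A minimal sketch of the background-thread idea, assuming a single FIFO stream of writes (which trivially preserves ordering dependencies): calls enqueue work and return immediately, and a worker thread performs the blocking I/O while the application computes. The paper's framework, implemented as an HDF5 VOL connector, tracks dependencies across all I/O operations rather than relying on queue order.

```python
import queue
import threading

io_queue = queue.Queue()

def worker():
    while True:
        item = io_queue.get()
        if item is None:                 # shutdown sentinel
            break
        path, data = item
        with open(path, "a") as f:       # the blocking I/O happens here
            f.write(data)
        io_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def async_write(path, data):
    """Returns immediately; the write overlaps with later computation."""
    io_queue.put((path, data))

async_write("/tmp/out.log", "step 1 done\n")
# ... computation proceeds while the write completes in the background ...
io_queue.join()     # wait for outstanding I/O, like an explicit flush/barrier
io_queue.put(None)  # stop the worker
```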
Active Learning-based Automatic Tuning and Prediction of Parallel I/O Performance
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-11-01 | DOI: 10.1109/PDSW49588.2019.00007
Authors: Megha Agarwal, Divyansh Singhvi, Preeti Malakar, S. Byna
Abstract: Parallel I/O is an indispensable part of scientific applications. The current parallel I/O stack contains many tunable parameters. While changing these parameters can increase I/O performance many-fold, application developers usually resort to default values because tuning is a cumbersome process that requires expertise. We propose two auto-tuning models, based on active learning, that recommend a good set of parameter values (currently tested with Lustre parameters and MPI-IO hints) for an application on a given system. These models use Bayesian optimization to find parameter values by minimizing an objective function. The first model runs the application to determine these values, whereas the second model uses an I/O prediction model instead, which reduces training time significantly (e.g., from 800 seconds to 18 seconds). Both models also provide the flexibility to focus on improving either read or write performance; to keep the tuning process generic, we have focused on both. We have validated our models using an I/O benchmark (IOR) and three scientific application I/O kernels (S3D-IO, BT-IO, and GenericIO) on two supercomputers (HPC2010 and Cori). Using the two models, we achieve an increase in I/O bandwidth of up to 11× over the default parameters. We obtained up to 3× improvement for 37 TB writes, corresponding to 1 billion particles in GenericIO, and up to 3.2× higher bandwidth for 4.8 TB of noncontiguous I/O in the BT-IO benchmark.
Citations: 10
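A hedged sketch of such a tuning loop: Bayesian optimization with a Gaussian-process surrogate and an expected-improvement acquisition over two hypothetical parameters (stripe count and stripe size). The run_io placeholder stands in for either of the paper's models: executing the application (the first model) or querying an I/O performance predictor (the second).

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def run_io(stripe_count, stripe_size_mb):
    # Placeholder objective (negative "bandwidth"); a real tuner would
    # measure the application or query a prediction model here.
    return -(stripe_count * stripe_size_mb) / (1 + abs(stripe_count - 16))

candidates = np.array([(c, s) for c in (1, 4, 8, 16, 32, 64)
                              for s in (1, 4, 16, 64)], dtype=float)
X, y = [candidates[0]], [run_io(*candidates[0])]

for _ in range(10):                                   # active-learning loop
    gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True)
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(candidates, return_std=True)
    best = min(y)
    z = (best - mu) / np.maximum(sigma, 1e-12)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    ei = np.where(sigma > 1e-12, ei, 0.0)  # no gain from re-sampling points
    nxt = candidates[int(np.argmax(ei))]
    X.append(nxt)
    y.append(run_io(*nxt))

print("best parameters:", X[int(np.argmin(y))], "objective:", min(y))
```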
In Search of a Fast and Efficient Serverless DAG Engine
2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) | Pub Date: 2019-10-14 | DOI: 10.1109/PDSW49588.2019.00005
Authors: Benjamin Carver, Jingyuan Zhang, Ao Wang, Yue Cheng
Abstract: Python-written data analytics applications can be modeled as, and compiled into, a directed acyclic graph (DAG)-based workflow, where the nodes are fine-grained tasks and the edges are task dependencies. Such analytics workflow jobs are increasingly characterized by short, fine-grained tasks with large fan-outs. These characteristics make them well-suited for a new cloud computing model called serverless computing or Function-as-a-Service (FaaS), which has become prevalent in recent years. The auto-scaling property of serverless computing platforms accommodates short tasks and bursty workloads, while the pay-per-use billing model of serverless computing providers keeps the cost of short tasks low. In this paper, we thoroughly investigate the problem space of DAG scheduling in serverless computing. We identify and evaluate a set of techniques to make DAG schedulers serverless-aware. These techniques have been implemented in WUKONG, a serverless DAG scheduler attuned to AWS Lambda. WUKONG provides decentralized scheduling through a combination of static and dynamic scheduling. We present the results of an empirical study in which WUKONG is applied to a range of microbenchmark and real-world DAG applications. Results demonstrate the efficacy of WUKONG in minimizing the performance overhead introduced by AWS Lambda: WUKONG achieves competitive performance compared to a serverful DAG scheduler, while improving the performance of real-world DAG jobs by as much as 4.1× at larger scale.
Citations: 27
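To illustrate the scheduling problem, the sketch below executes a small fan-out/fan-in DAG, dispatching each task as soon as its parents finish. A thread pool stands in for serverless function invocations here; WUKONG's actual decentralized scheduler on AWS Lambda combines static and dynamic scheduling and is considerably more sophisticated.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# DAG as task -> set of parent tasks (one load, three maps, one reduce)
deps = {"load": set(),
        "map1": {"load"}, "map2": {"load"}, "map3": {"load"},
        "reduce": {"map1", "map2", "map3"}}

def run_task(name):
    print("running", name)
    return name

def execute(deps):
    remaining = {t: set(p) for t, p in deps.items()}  # unmet parents per task
    scheduled, futures = set(), {}
    with ThreadPoolExecutor(max_workers=8) as pool:
        def launch_ready():
            for task, parents in remaining.items():
                if not parents and task not in scheduled:
                    scheduled.add(task)
                    futures[pool.submit(run_task, task)] = task

        launch_ready()                        # roots have no parents
        while futures:
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
            for fut in done:
                finished = futures.pop(fut)
                for parents in remaining.values():
                    parents.discard(finished)  # children may become ready
            launch_ready()

execute(deps)
```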