Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems: Latest Publications

Software-defined storage for fast trajectory queries using a deltaFS indexed massive directory
Qing Zheng, George Amvrosiadis, Saurabh Kadekodi, Garth A. Gibson, C. Cranor, B. Settlemyer, G. Grider, Fan Guo
{"title":"Software-defined storage for fast trajectory queries using a deltaFS indexed massive directory","authors":"Qing Zheng, George Amvrosiadis, Saurabh Kadekodi, Garth A. Gibson, C. Cranor, B. Settlemyer, G. Grider, Fan Guo","doi":"10.1145/3149393.3149398","DOIUrl":"https://doi.org/10.1145/3149393.3149398","url":null,"abstract":"In this paper we introduce the Indexed Massive Directory, a new technique for indexing data within DeltaFS. With its design as a scalable, server-less file system for HPC platforms, DeltaFS scales file system metadata performance with application scale. The Indexed Massive Directory is a novel extension to the DeltaFS data plane, enabling in-situ indexing of massive amounts of data written to a single directory simultaneously, and in an arbitrarily large number of files. We achieve this through a memory-efficient indexing mechanism for reordering and indexing data, and a log-structured storage layout to pack small writes into large log objects, all while ensuring compute node resources are used frugally. We demonstrate the efficiency of this indexing mechanism through VPIC, a widely-used simulation code that scales to trillions of particles. With DeltaFS, we modify VPIC to create a file for each particle to receive writes of that particle's output data. Dynamically indexing the directory's underlying storage allows us to achieve a 5000x speedup in single particle trajectory queries, which require reading all data for a single particle. This speedup increases with application scale while the overhead is fixed at 3% of available memory.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130081687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
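The DeltaFS abstract above hinges on two ideas: packing many small per-particle writes into large log objects, and keeping a compact index so a single particle's trajectory can later be read back in one pass. The Python sketch below illustrates that general pattern (append-only log plus a per-key offset index). It is a minimal didactic model, not the DeltaFS implementation; the class and method names are hypothetical.

```python
from collections import defaultdict

class LogStructuredIndex:
    """Toy log-structured store: small per-key writes are appended to one
    large log file, while an in-memory index maps each key to the
    (offset, length) pairs needed to read all of its data back."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.index = defaultdict(list)      # key -> [(offset, length), ...]
        self.log = open(log_path, "ab")

    def append(self, key, payload: bytes):
        offset = self.log.tell()
        self.log.write(payload)             # small write packed into the big log
        self.index[key].append((offset, len(payload)))

    def query(self, key) -> bytes:
        """Read every record written for one key (a 'trajectory query')."""
        self.log.flush()
        chunks = []
        with open(self.log_path, "rb") as f:
            for offset, length in self.index[key]:
                f.seek(offset)
                chunks.append(f.read(length))
        return b"".join(chunks)

# Example: one record per particle per timestep, then read one particle back.
store = LogStructuredIndex("particles.log")
for step in range(3):
    for pid in ("p0", "p1", "p2"):
        store.append(pid, f"{pid},step={step};".encode())
print(store.query("p1").decode())
```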
Toward scalable monitoring on large-scale storage for software defined cyberinfrastructure
A. Paul, S. Tuecke, Ryan Chard, A. Butt, K. Chard, Ian T Foster
{"title":"Toward scalable monitoring on large-scale storage for software defined cyberinfrastructure","authors":"A. Paul, S. Tuecke, Ryan Chard, A. Butt, K. Chard, Ian T Foster","doi":"10.1145/3149393.3149402","DOIUrl":"https://doi.org/10.1145/3149393.3149402","url":null,"abstract":"As research processes become yet more collaborative and increasingly data-oriented, new techniques are needed to efficiently manage and automate the crucial, yet tedious, aspects of the data life-cycle. Researchers now spend considerable time replicating, cataloging, sharing, analyzing, and purging large amounts of data, distributed over vast storage networks. Software Defined Cyberinfrastructure (SDCI) provides a solution to this problem by enhancing existing storage systems to enable the automated execution of actions based on the specification of high-level data management policies. Our SDCI implementation, called Ripple, relies on agents being deployed on storage resources to detect and act on data events. However, current monitoring technologies, such as inotify, are not generally available on large or parallel file systems, such as Lustre. We describe here an approach for scalable, lightweight, event detection on large (multi-petabyte) Lustre file systems. Together, Ripple and the Lustre monitor enable new types of lifecycle automation across both personal devices and leadership computing platforms.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132495301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
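The monitoring paper above targets Lustre, where inotify is unavailable; Lustre instead exposes an MDT changelog that a registered reader can poll. The sketch below shows one possible polling loop built on the standard `lfs changelog` command. It is an assumption-laden illustration, not the paper's monitor: the MDT name is a placeholder, the record layout varies across Lustre versions, and a registered changelog reader with adequate privileges is assumed.

```python
import subprocess
import time

MDT = "lustre-MDT0000"   # assumed MDT name; site-specific
last_record = 0

def poll_changelog(mdt, start_rec):
    """Return raw changelog lines newer than start_rec via `lfs changelog`."""
    out = subprocess.run(
        ["lfs", "changelog", mdt, str(start_rec + 1)],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]

while True:
    try:
        for line in poll_changelog(MDT, last_record):
            fields = line.split()
            last_record = int(fields[0])    # first field is the record number
            event_type = fields[1]          # e.g. 01CREAT, 06UNLNK (version-dependent)
            print(f"event #{last_record}: {event_type}")
            # here an agent such as Ripple would match the event against policies
    except subprocess.CalledProcessError as err:
        print("changelog read failed:", err)
    time.sleep(5)                           # lightweight polling interval
```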
Taming metadata storms in parallel filesystems with metaFS
Tim Shaffer, D. Thain
{"title":"Taming metadata storms in parallel filesystems with metaFS","authors":"Tim Shaffer, D. Thain","doi":"10.1145/3149393.3149401","DOIUrl":"https://doi.org/10.1145/3149393.3149401","url":null,"abstract":"Metadata performance remains a serious bottleneck in parallel filesystems. In particular, when complex applications start up on many nodes at once, a \"metadata storm\" occurs as each instance traverses the filesystem in order to search for executables, libraries, and other necessary runtime components. Not only does this delay the application in question, but it can render the entire system unusable by other clients. To address this problem, we present MetaFS, a user-level overlay filesystem that sits on top of an existing parallel filesystem. MetaFS indexes the static metadata content of complex applications and delivers it in bulk to execution nodes, where it can be cached and queried quickly, while relying on the existing parallel filesystem for data delivery. We demonstrate that MetaFS applied to a complex bioinformatics application converts the metadata load placed on a production Panasas filesystem from 1.1 million operations per task to 1.9 MB of bulk data per task, increasing the metadata scalability limit of the application from 66 nodes to 5,000 nodes.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131930831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
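MetaFS, as summarized above, avoids the metadata storm by indexing an application's static metadata once and shipping it to execution nodes in bulk, where lookups are served from a local cache. A minimal sketch of that idea follows, assuming a plain JSON file stands in for the bulk index; the function names are illustrative only and not part of MetaFS.

```python
import json
import os

def build_metadata_index(root):
    """Walk an application tree once and record the metadata that startup
    probes (stat/open of executables, libraries, config files) would
    otherwise request individually from the parallel filesystem."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            index[path] = {"size": st.st_size, "mode": st.st_mode,
                           "mtime": st.st_mtime}
    return index

def ship_index(index, out_path):
    """'Bulk delivery': serialize the whole index as one object so execution
    nodes issue one large read instead of millions of metadata operations."""
    with open(out_path, "w") as f:
        json.dump(index, f)

def cached_stat(index, path):
    """Answer a metadata query from the local cache; fall back to the real
    filesystem only for paths outside the static application content."""
    return index.get(path) or os.lstat(path)

# Example usage on an execution node (paths are hypothetical):
# idx = json.load(open("app_metadata.json"))
# cached_stat(idx, "/apps/bio/bin/tool")
```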
Performance analysis of emerging data analytics and HPC workloads
C. Daley, Prabhat, S. Dosanjh, N. Wright
{"title":"Performance analysis of emerging data analytics and HPC workloads","authors":"C. Daley, Prabhat, S. Dosanjh, N. Wright","doi":"10.1145/3149393.3149400","DOIUrl":"https://doi.org/10.1145/3149393.3149400","url":null,"abstract":"Supercomputers are increasingly being used to run a data analytics workload in addition to a traditional simulation science workload. This mixed workload must be rigorously characterized to ensure that appropriately balanced machines are deployed. In this paper we analyze a suite of applications representing the simulation science and data workload at the NERSC supercomputing center. We show how time is spent in application compute, library compute, communication and I/O, and present application performance on both the Intel Xeon and Intel Xeon-Phi partitions of the Cori supercomputer. We find commonality in the libraries used, I/O motifs and methods of parallelism, and obtain similar node-to-node performance for the base application configurations. We demonstrate that features of the Intel Xeon-Phi node architecture and a Burst Buffer can improve application performance, providing evidence that an exascale-era energy-efficient platform can support a mixed workload.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"186 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129855593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Architecting HBM as a high bandwidth, high capacity, self-managed last-level cache
Tyler Stocksdale, Mu-Tien Chang, Hongzhong Zheng, F. Mueller
{"title":"Architecting HBM as a high bandwidth, high capacity, self-managed last-level cache","authors":"Tyler Stocksdale, Mu-Tien Chang, Hongzhong Zheng, F. Mueller","doi":"10.1145/3149393.3149394","DOIUrl":"https://doi.org/10.1145/3149393.3149394","url":null,"abstract":"Due to the recent growth in the number of on-chip cores available in today's multi-core processors, there is an increased demand for memory bandwidth and capacity. However, off-chip DRAM is not scaling at the rate necessary for the growth in number of on-chip cores. Stacked DRAM last-level caches have been proposed to alleviate these bandwidth constraints, however, many of these ideas are not practical for real systems, or may not take advantage of the features available in today's stacked DRAM variants. In this paper, we design a last-level, stacked DRAM cache that is practical for real-world systems and takes advantage of High Bandwidth Memory (HBM) [1]. Our HBM cache only requires one minor change to existing memory controllers to support communication. It uses HBM's built-in logic die to handle tag storage and lookups. We also introduce novel tag/data storage that enables faster lookups, associativity, and more capacity than previous designs.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114416102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
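The HBM-cache abstract above turns on keeping tags in the HBM logic die and doing associative lookups there. To make the tag-lookup step concrete, here is a toy software model of a set-associative tag store; it is purely didactic, not the proposed hardware design, and parameters such as 16 ways, 16K sets, and 64 B lines are assumptions.

```python
# Toy set-associative tag store illustrating the lookup a cache controller
# performs before touching the data arrays.
LINE_BYTES = 64          # assumed cache-line size
NUM_SETS   = 1 << 14     # assumed number of sets
NUM_WAYS   = 16          # assumed associativity

# tags[set_index] is a list of (valid, tag) pairs, one per way
tags = [[(False, 0)] * NUM_WAYS for _ in range(NUM_SETS)]

def split_address(addr):
    block = addr // LINE_BYTES
    return block % NUM_SETS, block // NUM_SETS   # (set index, tag)

def lookup(addr):
    """Return the hitting way, or None on a miss."""
    set_idx, tag = split_address(addr)
    for way, (valid, stored_tag) in enumerate(tags[set_idx]):
        if valid and stored_tag == tag:
            return way
    return None

def fill(addr, way):
    """Install a block's tag after fetching the line from off-chip DRAM."""
    set_idx, tag = split_address(addr)
    tags[set_idx][way] = (True, tag)

fill(0x1234_5000, way=0)
print(lookup(0x1234_5000))   # -> 0 (hit)
print(lookup(0x9999_0000))   # -> None (miss)
```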
Optimized scatter/gather data operations for parallel storage
Latchesar Ionkov, C. Maltzahn, M. Lang
{"title":"Optimized scatter/gather data operations for parallel storage","authors":"Latchesar Ionkov, C. Maltzahn, M. Lang","doi":"10.1145/3149393.3149397","DOIUrl":"https://doi.org/10.1145/3149393.3149397","url":null,"abstract":"Scientific workflows contain an increasing number of interacting applications, often with big disparity between the formats of data being produced and consumed by different applications. This mismatch can result in performance degradation as data retrieval causes multiple read operations (often to a remote storage system) in order to convert the data. Although some parallel filesystems and middleware libraries attempt to identify access patterns and optimize data retrieval, they frequently fail if the patterns are complex. The goal of ASGARD is to replace I/O operations issued to a file by the processes with a single operation that passes enough semantic information to the storage system, so it can combine (and eventually optimize) the data movement. ASGARD allows application developers to define their application's abstract dataset as well as the subsets of the data (fragments) that are created and used by the HPC codes. It uses the semantic information to generate and execute transformation rules that convert the data between the the memory layouts of the producer and consumer applications, as well as the layout on nonvolatile storage. The transformation engine implements functionality similar to the scatter/gather support available in some file systems. Since data subsets are defined during the initialization phase, i.e., well in advance from the time they are used to store and retrieve data, the storage system has multiple opportunities to optimize both the data layout and the transformation rules in order to increase the overall I/O performance. In order to evaluate ASGARD's performance, we added support for ASGARD's transformation rules to Ceph's object store RADOS. We created Ceph data objects that allow custom data striping based on ASGARD's fragment definitions. Our tests with the extended RADOS show up to 5 times performance improvements for writes and 10 times performance improvements for reads over collective MPI I/O.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124902828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
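ASGARD's core idea, per the abstract above, is declaring fragments of an abstract dataset up front so the storage system can gather and scatter between producer and consumer layouts. The NumPy sketch below illustrates that declarative-fragment idea with plain slices; it is only an analogy to the concept, not ASGARD's rule language or its RADOS extension.

```python
import numpy as np

# Abstract dataset: a 2-D field produced in row-major order by a simulation.
field = np.arange(16, dtype=np.float64).reshape(4, 4)

# A "fragment" declared up front as index expressions over the abstract
# dataset (here: every other column), analogous to registering a subset
# with the storage system before any I/O happens.
fragment_spec = (slice(None), slice(0, 4, 2))

def gather(dataset, spec):
    """Producer side: pull the declared fragment into one contiguous buffer
    so it can be written with a single operation."""
    return np.ascontiguousarray(dataset[spec])

def scatter(buffer, spec, shape):
    """Consumer side: expand the contiguous buffer back into the consumer's
    in-memory layout for the same abstract dataset."""
    out = np.zeros(shape, dtype=buffer.dtype)
    out[spec] = buffer
    return out

packed = gather(field, fragment_spec)            # one contiguous write
restored = scatter(packed, fragment_spec, field.shape)
print(packed.shape, np.array_equal(restored[fragment_spec], field[fragment_spec]))
```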
Diving into petascale production file systems through large scale profiling and analysis
Feiyi Wang, Hyogi Sim, C. Harr, S. Oral
{"title":"Diving into petascale production file systems through large scale profiling and analysis","authors":"Feiyi Wang, Hyogi Sim, C. Harr, S. Oral","doi":"10.1145/3149393.3149399","DOIUrl":"https://doi.org/10.1145/3149393.3149399","url":null,"abstract":"As leadership computing facilities grow their storage capacity into the multi- petabyte range, the number of files and directories leap into the scale of billions. A complete profiling of such a parallel file system in a production environment presents a unique challenge. On one hand, the time, resources, and negative performance impact on production users can make regular profiling difficult. On the other hand, the result of such profiling can yield much needed understanding of the file system's general characteristics, as well as provide insight to how users write and access their data on a grand scale. This paper presents a lightweight and scalable profiling solution that can efficiently walk, analyze, and profile multi-petabyte parallel file systems. This tool has been deployed and is in regular use on very large-scale production parallel file systems at both Oak Ridge National Lab's Oak Ridge Leadership Facility (OLCF) and Lawrence Livermore National Lab's Livermore Computing (LC) facilities. We present the results of our initial analysis on the data collected from these two large-scale production systems, organized into three use cases: (1) file system snapshot and composition, (2) striping pattern analysis for Lustre, and (3) simulated storage capacity utilization in preparation for future file systems. Our analysis shows that on the OLCF file system, over 96% of user files exhibit the default stripe width, potentially limiting performance on large files by underutilizing storage servers and disks. Our simulated block analysis quantitatively shows the space overhead when doing a forklift system migration. It also reveals that due to the difference in system compositions (OLCF vs. LC), we can achieve better performance and space trade-offs by employing different native file system block sizes.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125054088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
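One finding above is that over 96% of OLCF files used the default Lustre stripe width. As a rough illustration of how such a stripe-count survey could be scripted around the standard `lfs getstripe` tool, consider the sketch below; the scan root and the default stripe count of 1 are assumptions, and a production walker (like the paper's) would parallelize the traversal rather than fork one process per file.

```python
import os
import subprocess
from collections import Counter

DEFAULT_STRIPE_COUNT = 1          # assumed site default
SCAN_ROOT = "/lustre/project"     # assumed path

def stripe_count(path):
    """Query a file's Lustre stripe count via `lfs getstripe -c`."""
    out = subprocess.run(["lfs", "getstripe", "-c", path],
                         capture_output=True, text=True)
    return int(out.stdout.strip()) if out.returncode == 0 else None

histogram = Counter()
for dirpath, _dirnames, filenames in os.walk(SCAN_ROOT):
    for name in filenames:
        count = stripe_count(os.path.join(dirpath, name))
        if count is not None:
            histogram[count] += 1

total = sum(histogram.values())
if total:
    default_share = 100.0 * histogram[DEFAULT_STRIPE_COUNT] / total
    print(f"{default_share:.1f}% of {total} files use the default stripe width")
```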
CoSS: proposing a contract-based storage system for HPC
Matthieu Dorier, Matthieu Dreher, T. Peterka, R. Ross
{"title":"CoSS: proposing a contract-based storage system for HPC","authors":"Matthieu Dorier, Matthieu Dreher, T. Peterka, R. Ross","doi":"10.1145/3149393.3149396","DOIUrl":"https://doi.org/10.1145/3149393.3149396","url":null,"abstract":"Data management is a critical component of high-performance computing, with storage as a cornerstone. Yet the traditional model of parallel file systems fails to meet users' needs, in terms of both performance and features. In this paper, we propose CoSS, a new storage model based on contracts. Contracts encapsulate in the same entity the data model (type, dimensions, units, etc.) and the intended uses of the data. They enable the storage system to work with much more knowledge about the input and output expected from an application and how it should be exposed to the user. This knowledge enables CoSS to optimize data formatting and placement to best fit user's requirements, storage space, and performance. This concept paper introduces the idea of contract-based storage systems and presents some of the opportunities it offers, in order to motivate further research in this direction.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123931166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
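Since CoSS is presented above as a concept, the sketch below shows just one conceivable way a contract could be expressed as a data structure coupling the data model (type, dimensions, units) with intended uses. Every field name here is hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class DataModel:
    """What the data is: type, shape, and units, rather than a byte stream."""
    dtype: str                      # e.g. "float64"
    dimensions: Tuple[int, ...]     # e.g. (1024, 1024, 1024)
    units: str                      # e.g. "K"

@dataclass
class Contract:
    """A contract couples the data model with how the data will be used,
    giving the storage system enough knowledge to choose layout and placement."""
    name: str
    model: DataModel
    producers: List[str]            # codes that write this dataset
    consumers: List[str]            # codes that read it
    access_pattern: str             # e.g. "timestep-append", "plane-slices"
    retention: str                  # e.g. "purge-after-analysis"

temperature = Contract(
    name="temperature_field",
    model=DataModel(dtype="float64", dimensions=(1024, 1024, 1024), units="K"),
    producers=["simulation"],
    consumers=["in-situ-viz", "checkpoint-restart"],
    access_pattern="plane-slices",
    retention="purge-after-analysis",
)
print(temperature.name, temperature.model.dimensions)
```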
Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems
K. Mohror, B. Welch
{"title":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","authors":"K. Mohror, B. Welch","doi":"10.1145/3149393","DOIUrl":"https://doi.org/10.1145/3149393","url":null,"abstract":"","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122840941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
UMAMI: a recipe for generating meaningful metrics through holistic I/O performance analysis
Glenn K. Lockwood, Wucherl Yoo, S. Byna, N. Wright, S. Snyder, K. Harms, Zachary Nault, P. Carns
{"title":"UMAMI: a recipe for generating meaningful metrics through holistic I/O performance analysis","authors":"Glenn K. Lockwood, Wucherl Yoo, S. Byna, N. Wright, S. Snyder, K. Harms, Zachary Nault, P. Carns","doi":"10.1145/3149393.3149395","DOIUrl":"https://doi.org/10.1145/3149393.3149395","url":null,"abstract":"I/O efficiency is essential to productivity in scientific computing, especially as many scientific domains become more data-intensive. Many characterization tools have been used to elucidate specific aspects of parallel I/O performance, but analyzing components of complex I/O subsystems in isolation fails to provide insight into critical questions: how do the I/O components interact, what are reasonable expectations for application performance, and what are the underlying causes of I/O performance problems? To address these questions while capitalizing on existing component-level characterization tools, we propose an approach that combines on-demand, modular synthesis of I/O characterization data into a unified monitoring and metrics interface (UMAMI) to provide a normalized, holistic view of I/O behavior. We evaluate the feasibility of this approach by applying it to a month-long benchmarking study on two distinct large-scale computing platforms. We present three case studies that highlight the importance of analyzing application I/O performance in context with both contemporaneous and historical component metrics, and we provide new insights into the factors affecting I/O performance. By demonstrating the generality of our approach, we lay the groundwork for a production-grade framework for holistic I/O analysis.","PeriodicalId":262458,"journal":{"name":"Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133035257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
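UMAMI's premise, per the abstract above, is that a single application I/O measurement is only meaningful in context with contemporaneous and historical component metrics. The sketch below shows one simple way to normalize heterogeneous metrics against their own history (percentile rank) so they can sit side by side in one view; it illustrates the idea only and is not the UMAMI tool, and the metric names and values are invented.

```python
from bisect import bisect_left

def percentile_rank(history, value):
    """Place a new measurement within its own historical distribution so
    metrics with different units become directly comparable. For
    higher-is-better metrics, 100 means best ever observed; for
    lower-is-better metrics (load, latency) the rank should be inverted,
    which is omitted here for brevity."""
    ordered = sorted(history)
    return 100.0 * bisect_left(ordered, value) / len(ordered)

# Hypothetical component metrics collected by separate tools (application
# bandwidth, server load, metadata latency), each with its own history.
metrics = {
    "app_write_GBps":      ([4.1, 5.0, 3.8, 6.2, 5.5], 4.6),
    "ost_avg_cpu_load":    ([0.35, 0.50, 0.42, 0.61, 0.38], 0.58),
    "mdt_open_latency_ms": ([1.2, 0.9, 1.6, 1.1, 1.4], 2.3),
}

print(f"{'metric':24s} {'today':>8s} {'pct rank':>9s}")
for name, (history, today) in metrics.items():
    print(f"{name:24s} {today:8.2f} {percentile_rank(history, today):8.1f}%")
```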