{"title":"Asynchronous object storage with QoS for scientific and commercial big data","authors":"Michael J. Brim, D. Dillow, S. Oral, B. Settlemyer, Feiyi Wang","doi":"10.1145/2538542.2538565","DOIUrl":"https://doi.org/10.1145/2538542.2538565","url":null,"abstract":"This paper presents our design for an asynchronous object storage system intended for use in scientific and commercial big data workloads. Use cases from the target workload domains are used to motivate the key abstractions used in the application programming interface (API). The architecture of the Scalable Object Store (SOS), a prototype object storage system that supports the API's facilities, is presented. The SOS serves as a vehicle for future research into scalable and resilient big data object storage. We briefly review our research into providing efficient storage servers capable of providing quality of service (QoS) contracts relevant for big data use cases.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125560712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SDS: a framework for scientific data services","authors":"Bin Dong, S. Byna, Kesheng Wu","doi":"10.1145/2538542.2538563","DOIUrl":"https://doi.org/10.1145/2538542.2538563","url":null,"abstract":"Large-scale scientific applications typically write their data to parallel file systems with organizations designed to achieve fast write speeds. Analysis tasks frequently read the data in a pattern that is different from the write pattern, and therefore experience poor I/O performance. In this paper, we introduce a prototype framework for bridging the performance gap between write and read stages of data access from parallel file systems. We call this framework Scientific Data Services, or SDS for short. This initial implementation of SDS focuses on reorganizing previously written files into data layouts that benefit read patterns, and transparently directs read calls to the reorganized data. SDS follows a client-server architecture. The SDS Server manages partial or full replicas of reorganized datasets and serves SDS Clients' requests for data. The current version of the SDS client library supports the HDF5 programming interface for reading data. The client library intercepts HDF5 calls using the HDF5 Virtual Object Layer (VOL) and transparently redirects them to the reorganized data. The SDS client library also provides a querying interface for reading part of the data based on user-specified selective criteria. We describe the design and implementation of the SDS client-server architecture, and evaluate the response time of the SDS Server and the performance benefits of SDS.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122744543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 8th Parallel Data Storage Workshop","authors":"Dean Hildebrand, K. Schwan","doi":"10.1145/2538542","DOIUrl":"https://doi.org/10.1145/2538542","url":null,"abstract":"","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121498966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structuring PLFS for extensibility","authors":"C. Cranor, Milo Polte, Garth A. Gibson","doi":"10.1145/2538542.2538564","DOIUrl":"https://doi.org/10.1145/2538542.2538564","url":null,"abstract":"The Parallel Log Structured Filesystem (PLFS) [5] was designed to transparently transform highly concurrent, massive high-performance computing (HPC) N-to-1 checkpoint workloads into N-to-N workloads to avoid single-file performance bottlenecks in typical HPC distributed filesystems. PLFS has produced speedups of 2-150X for N-1 workloads at Los Alamos National Lab. Having successfully improved N-1 performance, we have restructured PLFS for extensibility so that it can be applied to more workloads and storage systems. In this paper we describe PLFS' evolution from a single-purpose log-structured middleware filesystem into a more general platform for transparently translating application I/O patterns. As an example of this extensibility, we show how PLFS can now be used to enable HPC applications to perform N-1 checkpoints on an HDFS-based cloud storage system.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121626379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance and scalability evaluation of the Ceph parallel file system","authors":"Feiyi Wang, M. Nelson, S. Oral, S. Atchley, S. Weil, B. Settlemyer, Blake Caldwell, Jason Hill","doi":"10.1145/2538542.2538562","DOIUrl":"https://doi.org/10.1145/2538542.2538562","url":null,"abstract":"Ceph is an emerging open-source parallel distributed file and storage system. By design, Ceph leverages unreliable commodity storage and network hardware, and provides reliability and fault-tolerance via controlled object placement and data replication. This paper presents our file and block I/O performance and scalability evaluation of Ceph for scientific high-performance computing (HPC) environments. Our work makes two unique contributions. First, our evaluation is performed under a realistic setup for a large-scale capability HPC environment using a commercial high-end storage system. Second, our path of investigation, tuning efforts, and findings made direct contributions to Ceph's development and improved code quality, scalability, and performance. These changes should benefit both Ceph and the HPC community at large.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125276272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active data: a data-centric approach to data life-cycle management","authors":"Anthony Simonet, G. Fedak, M. Ripeanu, S. Al-Kiswany","doi":"10.1145/2538542.2538566","DOIUrl":"https://doi.org/10.1145/2538542.2538566","url":null,"abstract":"Data-intensive science offers new opportunities for innovation and discoveries, provided that large datasets can be handled efficiently. Data management for data-intensive science applications is challenging, requiring support for complex data life cycles, coordination across multiple sites, fault tolerance, and scalability to support tens of sites and petabytes of data. In this paper, we argue that data management for data-intensive science applications requires a fundamentally different management approach than the current ad-hoc task centric approach. We propose Active Data, a fundamentally novel paradigm for data life cycle management. Active Data follows two principles: data-centric and event-driven. We report on the Active Data programming model and its preliminary implementation, and discuss the benefits and limitations of the approach on recognized challenging data-intensive science use-cases.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133875670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient transactions for parallel data movement","authors":"J. Lofstead, Jai Dayal, I. Jimenez, C. Maltzahn","doi":"10.1145/2538542.2538567","DOIUrl":"https://doi.org/10.1145/2538542.2538567","url":null,"abstract":"The rise of Integrated Application Workflows (IAWs) for processing data prior to storage on persistent media prompts the need to incorporate features that reproduce many of the semantics of persistent storage devices. One such feature is the ability to manage data sets as chunks with natural barriers between different data sets. Towards that end, we need a mechanism to ensure that data moved to an intermediate storage area is both complete and correct before allowing access by other processing components. The Doubly Distributed Transactions (D2T) protocol offers such a mechanism. The initial development [9] suffered from scalability limitations and undue requirements on server processes. The current version has addressed these limitations and has demonstrated scalability with low overhead.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123110886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fourier-assisted machine learning of hard disk drive access time models","authors":"A. Crume, C. Maltzahn, L. Ward, Thomas M. Kroeger, M. Curry, R. Oldfield","doi":"10.1145/2538542.2538561","DOIUrl":"https://doi.org/10.1145/2538542.2538561","url":null,"abstract":"Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. Others have created behavioral models of hard disk drive performance, but none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We show how hard disk drive access times can be predicted to within 0.83 ms using a neural net after these frequencies are found using Fourier analysis.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123571437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting intermediate storage performance for workflow applications","authors":"L. Costa, S. Al-Kiswany, A. Barros, Hao Yang, M. Ripeanu","doi":"10.1145/2538542.2538560","DOIUrl":"https://doi.org/10.1145/2538542.2538560","url":null,"abstract":"System configuration decisions for I/O-intensive workflow applications can be complex even for expert users. Users face decisions to configure several parameters optimally (e.g., replication level, chunk size, number of storage nodes) - each having an impact on overall application performance. This paper presents our progress on addressing the problem of supporting storage system configuration decisions for workflow applications. Our approach accelerates the exploration of the configuration space based on a low-cost performance predictor that estimates turn-around time of a workflow application in a given setup. Our evaluation shows that the predictor is effective in identifying the desired system configuration, and it is lightweight, using 2000-5000× fewer resources (machines × time) than running the actual benchmarks.","PeriodicalId":250653,"journal":{"name":"Proceedings of the 8th Parallel Data Storage Workshop","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129821174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}