P. Nowoczynski, J. Sommerfield, J. Yanovich, J. R. Scott, Zhihui Zhang, Michael J. Levine
Title: The data supercell
DOI: 10.1145/2335755.2335805 (https://doi.org/10.1145/2335755.2335805)
Publication date: 2012-07-16
Venue (as listed in the record): Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.), pages 13:1-13:11
Citations: 9
Abstract
The Data SuperCell (DSC) is a new, disk-based data archive deployed and in production at the Pittsburgh Supercomputing Center (PSC). It is designed to meet the archival demands of large-scale data processing economically. DSC combines PSC's SLASH2 layered filesystem technology with commodity hardware and open software to provide superior functionality, flexibility, manageability, reliability, performance, and cost efficiency. Below, we describe the DSC functionality goals; the SLASH2 architecture, its capabilities, and its suitability for archival applications; ZFS as an underlying file system; and the DSC architecture, structure, and capabilities. We then discuss our experience with DSC, present some performance measurements, and outline plans for further development.