Kunal Lillaney, D. Kleissas, Alexander Eusman, E. Perlman, William R. Gray Roncal, J. Vogelstein, R. Burns
Building NDStore Through Hierarchical Storage Management and Microservice Processing
2018 IEEE 14th International Conference on e-Science (e-Science), pp. 223-233, 2018-10-01
DOI: 10.1109/eScience.2018.00037
Citations: 6
Abstract
We describe NDStore, a scalable multi-hierarchical data storage deployment for spatial analysis of neuroscience data on the AWS cloud. The design is driven by the need to sustain high I/O throughput for workloads that build neural connectivity maps of the brain from petascale imaging data using computer vision algorithms. To limit deployment costs, we store all data in S3, the AWS object store, which serves as the base tier of storage. Redis, an in-memory key-value engine, serves as the caching tier. Data moves dynamically between the storage tiers based on user access. All programming interfaces to the system are RESTful web services. We include a performance evaluation showing that the production system provides good performance for a variety of workloads by combining the strengths of multiple cloud services.
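The tiering described in the abstract can be illustrated with a minimal sketch. This is not NDStore's implementation: plain Python dictionaries stand in for S3 and Redis, and the class name `TieredStore`, the keys, the LRU eviction rule, and the write-through policy are all assumptions chosen for illustration. The abstract states only that data moves between tiers based on user access.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-tier store: a durable base tier behind a small cache."""

    def __init__(self, base, cache_capacity):
        self.base = base                # stand-in for the base tier (S3 in the paper)
        self.cache = OrderedDict()      # stand-in for the caching tier (Redis in the paper)
        self.cache_capacity = cache_capacity
        self.hits = 0
        self.misses = 0

    def read(self, key):
        """Serve from the cache when possible; otherwise fetch from base and promote."""
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.base[key]              # raises KeyError if the key is absent
        self._promote(key, value)
        return value

    def write(self, key, value):
        """Write-through (an assumption): persist to base, then cache the value."""
        self.base[key] = value
        self._promote(key, value)

    def _promote(self, key, value):
        """Insert into the cache, evicting least recently used entries (assumed policy)."""
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)


# Hypothetical keys and payloads, for illustration only.
store = TieredStore(base={"block/0": b"a", "block/1": b"b", "block/2": b"c"},
                    cache_capacity=2)
store.read("block/0")   # miss: fetched from the base tier and promoted
store.read("block/0")   # hit: served from the cache
store.read("block/1")   # miss
store.read("block/2")   # miss: promoting it evicts "block/0"
print(store.hits, store.misses, list(store.cache))
```

In a real deployment the two dictionaries would be replaced by clients for the object store and the in-memory engine; the sketch only shows how access patterns decide which data sits in the fast tier.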