In Situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses

2016 45th International Conference on Parallel Processing (ICPP). Pub Date: 2016-08-01. DOI: 10.1109/ICPP.2016.53

Houjun Tang, S. Byna, Steve Harenberg, Wenzhao Zhang, Xiaocheng Zou, Daniel F. Martin, Bin Dong, D. Devendran, Kesheng Wu, D. Trebotich, S. Klasky, N. Samatova


Citations: 4

Abstract

Analyses of large simulation data often concentrate on regions in space and in time that contain important information. As simulations adopt Adaptive Mesh Refinement (AMR), the data records from a region of interest can be widely scattered on storage devices, and accessing such regions results in significantly reduced I/O performance. In this work, we study the organization of block-structured AMR data on storage to improve the performance of spatio-temporal data accesses. AMR has a complex hierarchical multi-resolution data structure that does not fit easily with existing approaches, which focus on uniform mesh data. To enable efficient AMR read accesses, we develop an in situ data layout optimization framework. Our framework automatically selects from a set of candidate layouts based on a performance model, and reorganizes the data before writing to storage. We evaluate this framework with three AMR datasets and access patterns derived from scientific applications. Our performance model is able to identify the best layout scheme and yields up to a 3x read performance improvement compared to the original layout. Though it is not possible to turn all read accesses into contiguous reads, we are able to achieve, on average, 90% of contiguous read throughput with the optimized layouts.
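The abstract does not give the form of the performance model, so the following is only a minimal sketch of the general idea of model-based layout selection, not the authors' method. It assumes a toy cost model (a fixed seek penalty per contiguous run plus transfer time), and every name and number in it (`count_runs`, `estimated_read_time`, `select_layout`, the 5 ms seek, the 200 MB/s bandwidth, the eight-block dataset) is hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def count_runs(layout: Sequence[int], wanted: Set[int]) -> int:
    """Count contiguous runs of wanted blocks in the on-storage order."""
    runs, in_run = 0, False
    for block in layout:
        if block in wanted:
            if not in_run:
                runs += 1  # a new run starts here, i.e. one more seek
            in_run = True
        else:
            in_run = False
    return runs


def estimated_read_time(layout: Sequence[int], query: Sequence[int],
                        block_bytes: int, seek_s: float = 0.005,
                        bandwidth_bps: float = 200e6) -> float:
    """Toy model (an assumption): one seek per run plus transfer time."""
    wanted = set(query)
    seeks = count_runs(layout, wanted)
    return seeks * seek_s + len(wanted) * block_bytes / bandwidth_bps


def select_layout(candidates: Dict[str, List[int]],
                  queries: List[List[int]],
                  block_bytes: int) -> Tuple[str, Dict[str, float]]:
    """Pick the candidate layout with the lowest total estimated read time."""
    totals = {
        name: sum(estimated_read_time(order, q, block_bytes) for q in queries)
        for name, order in candidates.items()
    }
    return min(totals, key=totals.get), totals


# Hypothetical data: blocks 0-3 are coarse, blocks 4-7 are their refined children.
candidates = {
    "level_order": [0, 1, 2, 3, 4, 5, 6, 7],    # all coarse blocks, then all fine
    "spatial_order": [0, 4, 1, 5, 2, 6, 3, 7],  # each coarse block next to its child
}
# Each query reads one spatial region across both refinement levels.
queries = [[0, 4], [1, 5], [2, 6]]

best, totals = select_layout(candidates, queries, block_bytes=1 << 20)
print(best, totals)
```

In this toy setup, region queries that span refinement levels need two seeks each under the level-ordered layout and one under the spatially interleaved layout, so the model prefers the latter. A different access pattern, such as reading a whole level at once, would favor the level-ordered layout, which is why the choice is driven by the expected queries.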

