In Situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses

Houjun Tang, S. Byna, Steve Harenberg, Wenzhao Zhang, Xiaocheng Zou, Daniel F. Martin, Bin Dong, D. Devendran, Kesheng Wu, D. Trebotich, S. Klasky, N. Samatova
Published in: 2016 45th International Conference on Parallel Processing (ICPP), 2016-08-01
DOI: 10.1109/ICPP.2016.53 (https://doi.org/10.1109/ICPP.2016.53)
Citations: 4

Abstract

Analyses of large simulation data often concentrate on regions in space and time that contain important information. As simulations adopt Adaptive Mesh Refinement (AMR), the data records from a region of interest can be widely scattered across storage devices, and accessing such regions results in significantly reduced I/O performance. In this work, we study how to organize block-structured AMR data on storage to improve the performance of spatio-temporal data accesses. AMR has a complex, hierarchical, multi-resolution data structure that does not fit easily with existing approaches, which focus on uniform mesh data. To enable efficient AMR read accesses, we develop an in situ data layout optimization framework. Our framework automatically selects from a set of candidate layouts based on a performance model and reorganizes the data before writing it to storage. We evaluate this framework with three AMR datasets and access patterns derived from scientific applications. Our performance model identifies the best layout scheme and yields up to a 3x read performance improvement over the original layout. Although not all read accesses can be turned into contiguous reads, the optimized layouts achieve, on average, 90% of contiguous read throughput.
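The idea of selecting a layout from a set of candidates with a performance model can be illustrated with a minimal sketch. This is not the authors' model: the cost formula (seek latency per contiguous run plus transfer time), the default latency and bandwidth values, the layout names, and every function and field name below are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadEstimate:
    """Hypothetical I/O estimate for one access pattern under one layout."""
    contiguous_runs: int  # number of separate contiguous reads required
    bytes_read: int       # total bytes transferred, including unwanted data


def estimated_read_time(est, seek_latency_s=0.005, bandwidth_bytes_per_s=500e6):
    """Assumed cost model: one seek per contiguous run plus transfer time."""
    return est.contiguous_runs * seek_latency_s + est.bytes_read / bandwidth_bytes_per_s


def select_layout(candidates, pattern_weights, **model_params):
    """Return the candidate layout with the lowest weighted estimated read time.

    candidates:      {layout_name: {pattern_name: ReadEstimate}}
    pattern_weights: {pattern_name: relative frequency of that access pattern}
    """
    def total_cost(layout_name):
        per_pattern = candidates[layout_name]
        return sum(
            weight * estimated_read_time(per_pattern[pattern], **model_params)
            for pattern, weight in pattern_weights.items()
        )

    return min(candidates, key=total_cost)


# Illustrative numbers only; they are not measurements from the paper.
candidates = {
    "original": {"spatial_box": ReadEstimate(contiguous_runs=400, bytes_read=64_000_000)},
    "reorganized": {"spatial_box": ReadEstimate(contiguous_runs=40, bytes_read=80_000_000)},
}
best = select_layout(candidates, {"spatial_box": 1.0})
print(best)
```

With these made-up numbers the reorganized layout is selected even though it transfers more bytes, because it needs far fewer separate contiguous reads. That trade-off is the kind of decision such a model has to make before the data is written to storage.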