Experiences using smaash to manage data-intensive simulations

IEEE International Symposium on High-Performance Parallel Distributed Computing
Pub Date: 2011-06-08
DOI: 10.1145/1996130.1996158

R. Hudson, Johnny Norris, L. Reid, K. Weide, George Cal Jordan IV, M. Papka


Citations: 3

Abstract

High performance scientific computer simulations created with such systems as the University of Chicago's FLASH code generate enormous amounts of data that must be captured, cataloged, and analyzed. Unless this is formally done, monitoring such simulations, tracking and reproducing old ones, and analyzing and archiving their output, can be haphazard and idiosyncratic. Smaash, a simulation management and analysis system that has been developed at the University of Chicago and Argonne National Laboratory, seeks to solve some of these problems by offering what approaches a single point of control and analysis, a metadata-base, and a set of tools that automate some of what scientists have been doing by hand. Smaash was designed to be independent of the particular simulation code, and is accessible from many computing platforms. It is automatic and standardized, and was built using open source software tools. Data security is considered throughout the process, yet users are insulated from onerous verification procedures. Because the system was developed with feedback from scientific users, its user interface reflects how scientists work in their daily life. We describe our system and a typical simulation it was designed to support. We illustrate its utility with several examples describing our experience of freeing scientists from the data manipulation phase to focus on the computational results and the analysis of high performance computing.
