Measurements of errors in large-scale computational simulations at runtime

M. N. Dinh, Q. M. Nguyen
Published in: 2020 RIVF International Conference on Computing and Communication Technologies (RIVF)
Publication date: 2020-10-01
DOI: 10.1109/RIVF48685.2020.9140785 (https://doi.org/10.1109/RIVF48685.2020.9140785)
Citation count: 0

Abstract

Verification of simulation codes often involves comparing the simulation's output behavior to a known model using graphical displays or statistical tests. Such a process is challenging for large-scale scientific codes at runtime because they often involve thousands of processes and generate very large data structures. In our earlier work, we proposed a statistical framework for testing the correctness of large-scale applications using their runtime data. This paper studies the concept of 'distribution distance' and establishes the requirements for measuring the runtime differences between a verified stochastic simulation system and its larger-scale counterpart. The paper discusses two types of distribution distance: the χ² distance and the histogram distance. We prototype the verification methodology and evaluate its performance on two production simulation programs. All experiments were conducted on a 20,000-core Cray XE6.
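
As a rough illustration of the two distribution distances named in the abstract, the sketch below compares two binned samples. The abstract does not give the paper's exact definitions, so the formulas here are assumptions: a common symmetric form of the χ² distance, and a histogram distance taken as half the L1 difference between normalized histograms. The function names are hypothetical.

```python
from typing import List, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn raw bin counts into relative frequencies that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    return [c / total for c in counts]


def chi2_distance(p_counts: Sequence[float], q_counts: Sequence[float]) -> float:
    """Symmetric chi-squared distance (assumed form, not taken from the paper):
    0.5 * sum((p_i - q_i)^2 / (p_i + q_i)) over bins where p_i + q_i > 0."""
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must use the same bins")
    p, q = normalize(p_counts), normalize(q_counts)
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def histogram_distance(p_counts: Sequence[float], q_counts: Sequence[float]) -> float:
    """Histogram distance (assumed form, not taken from the paper):
    half the L1 difference between the normalized histograms."""
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must use the same bins")
    p, q = normalize(p_counts), normalize(q_counts)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical bin counts: a verified small-scale run vs. a larger-scale run.
reference_run = [10, 10]
scaled_run = [15, 5]
print(chi2_distance(reference_run, scaled_run))
print(histogram_distance(reference_run, scaled_run))
```

Under these assumed definitions both distances are 0 for identical histograms and 1 for histograms with no overlapping bins, so a threshold between those bounds could flag a larger-scale run whose output distribution drifts from the verified one.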