The Second Competition on Spatial Statistics for Large Datasets

Sameh Abdulah, Faten S. Alamri, Pratik Nag, Ying Sun, H. Ltaief, D. Keyes, M. Genton
{"title":"The Second Competition on Spatial Statistics for Large Datasets","authors":"Sameh Abdulah, Faten S. Alamri, Pratik Nag, Ying Sun, H. Ltaief, D. Keyes, M. Genton","doi":"10.6339/22-jds1076","DOIUrl":null,"url":null,"abstract":"In the last few decades, the size of spatial and spatio-temporal datasets in many research areas has rapidly increased with the development of data collection technologies. As a result, classical statistical methods in spatial statistics are facing computational challenges. For example, the kriging predictor in geostatistics becomes prohibitive on traditional hardware architectures for large datasets as it requires high computing power and memory footprint when dealing with large dense matrix operations. Over the years, various approximation methods have been proposed to address such computational issues, however, the community lacks a holistic process to assess their approximation efficiency. To provide a fair assessment, in 2021, we organized the first competition on spatial statistics for large datasets, generated by our ExaGeoStat software, and asked participants to report the results of estimation and prediction. Thanks to its widely acknowledged success and at the request of many participants, we organized the second competition in 2022 focusing on predictions for more complex spatial and spatio-temporal processes, including univariate nonstationary spatial processes, univariate stationary space-time processes, and bivariate stationary spatial processes. In this paper, we describe in detail the data generation procedure and make the valuable datasets publicly available for a wider adoption. Then, we review the submitted methods from fourteen teams worldwide, analyze the competition outcomes, and assess the performance of each team.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of data science : JDS","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.6339/22-jds1076","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

In the last few decades, the size of spatial and spatio-temporal datasets in many research areas has rapidly increased with the development of data collection technologies. As a result, classical statistical methods in spatial statistics are facing computational challenges. For example, the kriging predictor in geostatistics becomes prohibitive on traditional hardware architectures for large datasets as it requires high computing power and memory footprint when dealing with large dense matrix operations. Over the years, various approximation methods have been proposed to address such computational issues, however, the community lacks a holistic process to assess their approximation efficiency. To provide a fair assessment, in 2021, we organized the first competition on spatial statistics for large datasets, generated by our ExaGeoStat software, and asked participants to report the results of estimation and prediction. Thanks to its widely acknowledged success and at the request of many participants, we organized the second competition in 2022 focusing on predictions for more complex spatial and spatio-temporal processes, including univariate nonstationary spatial processes, univariate stationary space-time processes, and bivariate stationary spatial processes. In this paper, we describe in detail the data generation procedure and make the valuable datasets publicly available for a wider adoption. Then, we review the submitted methods from fourteen teams worldwide, analyze the competition outcomes, and assess the performance of each team.
第二届大型数据集空间统计竞赛
在过去的几十年里,随着数据收集技术的发展,许多研究领域的空间和时空数据集的规模迅速增加。因此,空间统计学中的经典统计方法面临着计算挑战。例如,地质统计学中的克里格预测器在大型数据集的传统硬件架构上变得令人望而却步,因为它在处理大型密集矩阵运算时需要高计算能力和内存占用。多年来,已经提出了各种近似方法来解决这些计算问题,然而,社区缺乏评估其近似效率的整体过程。为了提供公平的评估,2021年,我们组织了第一次大型数据集空间统计竞赛,由我们的ExaGeoStat软件生成,并要求参与者报告估计和预测结果。由于其获得了广泛认可的成功,并应许多参与者的要求,我们在2022年组织了第二次比赛,重点是对更复杂的空间和时空过程的预测,包括单变量非平稳空间过程、单变量平稳时空过程和双变量平稳空间过程。在本文中,我们详细描述了数据生成过程,并将有价值的数据集公开以供更广泛的采用。然后,我们审查了来自全球14支球队的提交方法,分析了比赛结果,并评估了每支球队的表现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信