How to make your results reproducible with UCR-star and spider

Tomal Majumder, A. Eldawy
Published in: Proceedings of the 4th ACM SIGSPATIAL International Workshop on APIs and Libraries for Geospatial Data Science, 2022-11-01
DOI: 10.1145/3557994.3565975 (https://doi.org/10.1145/3557994.3565975)

Abstract

With the rise of data science, there has been a sharp increase in data-driven techniques that rely on both real and synthetic data. At the same time, the scientific community is increasingly interested in the reproducibility of results. Some conferences include reproducibility explicitly in their review forms or award special badges to reproducible papers. This tutorial describes two systems that facilitate the design of reproducible experiments on both real and synthetic data. UCR-Star is an interactive repository that hosts terabytes of open geospatial data. In addition to letting users explore and visualize this data, UCR-Star makes it easy to share all or parts of these datasets in many standard formats, ensuring that other researchers can get exactly the same data mentioned in the paper. Spider is a spatial data generator that produces standardized spatial datasets with full control over the data characteristics, which further promotes the reproducibility of results. This tutorial will be organized into two parts. The first part will exhibit the key features of UCR-Star and Spider, giving participants hands-on experience in interacting with real spatial datasets, generating synthetic data with varying distributions, and downloading them to a local machine or a remote server. The second part will explore the integration of both UCR-Star and Spider into existing systems such as QGIS and Apache AsterixDB.
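The reproducibility argument for a parameterized generator can be illustrated with a minimal sketch: if a dataset is fully determined by a small set of parameters (distribution, cardinality, seed), a paper only needs to report those parameters for others to regenerate the same data. The Python code below is an illustration under stated assumptions and is not Spider's actual interface; the function names `generate_points` and `to_csv`, the parameter names, and the Gaussian settings (mean 0.5, standard deviation 0.1) are all hypothetical.

```python
import random


def generate_points(distribution, n, seed):
    """Return n (x, y) points in the unit square.

    Hypothetical helper, not Spider's API: the same (distribution, n, seed)
    always yields the same list of points within one Python version.
    """
    rng = random.Random(seed)  # private generator; global random state is untouched
    points = []
    while len(points) < n:
        if distribution == "uniform":
            x, y = rng.random(), rng.random()
        elif distribution == "gaussian":
            # Assumed settings: centered in the unit square, std. dev. 0.1
            x, y = rng.gauss(0.5, 0.1), rng.gauss(0.5, 0.1)
        else:
            raise ValueError(f"unknown distribution: {distribution}")
        # Rejection step: keep only points that fall inside the unit square
        if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
            points.append((x, y))
    return points


def to_csv(points):
    """Serialize points as 'x,y' lines with fixed precision."""
    return "\n".join(f"{x:.6f},{y:.6f}" for x, y in points)


if __name__ == "__main__":
    # Reporting these three parameters is enough to regenerate the dataset.
    data = generate_points("gaussian", 5, seed=2022)
    print(to_csv(data))
```

Two calls with identical parameters return identical point lists, so the parameter triple acts as a compact, shareable description of the dataset. Changing only the seed gives a different sample from the same distribution.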