Evaluation-as-a-Service for the Computational Sciences

F. Hopfgartner, A. Hanbury, H. Müller, Ivan Eggel, K. Balog, Torben Brodt, G. Cormack, Jimmy J. Lin, Jayashree Kalpathy-Cramer, N. Kando, Makoto P. Kato, Anastasia Krithara, Tim Gollub, Martin Potthast, E. Viegas, Simon Mercer
Journal: Journal of Data and Information Quality (JDIQ)
Publication type: Journal Article
Published: 2018-10-29
Pages: 1 - 32
DOI: https://doi.org/10.1145/3239570
Citations: 21

Abstract

Evaluation in empirical computer science is essential to show progress and to assess the technologies developed. Several research domains, such as information retrieval, have long relied on systematic evaluation to measure progress: here, the Cranfield paradigm of creating shared test collections, defining search tasks, and collecting ground truth for these tasks has persisted to this day. In recent years, however, several new challenges have emerged that do not fit this paradigm well: extremely large data sets, confidential data sets as found in the medical domain, and rapidly changing data sets as often encountered in industry. Crowdsourcing has also changed the way industry approaches problem-solving, with companies now organizing challenges and handing out monetary awards to incentivize people to work on them, particularly in the field of machine learning. This article is based on discussions at a workshop on Evaluation-as-a-Service (EaaS). EaaS is the paradigm of not providing data sets to participants and having them work on the data locally, but instead keeping the data central and allowing access via Application Programming Interfaces (APIs), Virtual Machines (VMs), or other possibilities to ship executables. The objectives of this article are to summarize and compare the current approaches and to consolidate the experience gained with them in order to outline the next steps of EaaS, particularly toward sustainable research infrastructures. The article summarizes several existing approaches to EaaS and analyzes their usage scenarios as well as their advantages and disadvantages. It also summarizes the many factors influencing EaaS and describes the environment in terms of the motivations of the various stakeholders, from funding agencies to challenge organizers, researchers, and participants, to industry interested in supplying real-world problems for which it requires solutions.
EaaS solves many problems of the current research environment, where data sets are often not accessible to many researchers. Executables of published tools are equally often unavailable, making the results impossible to reproduce. EaaS, however, creates reusable and citable data sets as well as available executables. Many challenges remain, but such a framework for research can also foster more collaboration between researchers, potentially increasing the speed of obtaining research results.
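The core idea of the paradigm, keeping data and ground truth central and returning only scores to participants, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `EvaluationService`, its `submit` method, the toy documents, and the accuracy metric are hypothetical and are not taken from the article or from any existing EaaS platform. The participant's callable stands in for a shipped executable that runs on the service side.

```python
from typing import Callable, Dict


class EvaluationService:
    """Hypothetical EaaS stand-in: data and labels stay with the organizer."""

    def __init__(self, documents: Dict[str, str], ground_truth: Dict[str, str]) -> None:
        self._documents = documents        # never handed out to participants
        self._ground_truth = ground_truth  # labels are kept private as well
        self.leaderboard: Dict[str, float] = {}

    def submit(self, participant: str, system: Callable[[str], str]) -> float:
        """Run the submitted system on the central data; return only the score."""
        correct = 0
        for doc_id, text in self._documents.items():
            if system(text) == self._ground_truth[doc_id]:
                correct += 1
        accuracy = correct / len(self._documents)
        self.leaderboard[participant] = accuracy
        return accuracy


# Toy collection held by the (hypothetical) organizer.
service = EvaluationService(
    documents={
        "d1": "aspirin dosage for adults",
        "d2": "football results",
        "d3": "flu vaccine schedule",
    },
    ground_truth={"d1": "medical", "d2": "other", "d3": "medical"},
)


def keyword_system(text: str) -> str:
    """A participant's toy classifier, shipped to the service as a callable."""
    return "medical" if "vaccine" in text or "dosage" in text else "other"


score = service.submit("team_a", keyword_system)
baseline = service.submit("team_b", lambda text: "other")
print(service.leaderboard)
```

Because every submission is scored against the same centrally held collection, results from different participants stay comparable, and the submitted systems themselves remain available for later re-runs.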