Three cobblers worth the mastermind? The potential of ensemble in crowdsourced classification problems

IF 2.8 | Zone 4 (Management) | JCR Q2 (MANAGEMENT)

DECISION SCIENCES Pub Date : 2021-04-08 DOI:10.1111/deci.12516

Wangcheng Yan, Paolo Letizia, Wenjun Zhou

DECISION SCIENCES, vol. 53, no. 2, pp. 223-238. Journal Article. https://onlinelibrary.wiley.com/doi/10.1111/deci.12516

Citations: 1

Abstract

Classification problems, in which the objective is to identify the class labels of given data points, are most often the subject of open contests, in which solvers compete for awards offered by solution seekers. The extant literature on open contests has studied both the winner-takes-all and the top-K award schemes, in which the award is granted to the single best solution and to the best K solutions, respectively. However, in comparing these two schemes, researchers have never considered that, under a top-K award scheme, seekers may often benefit from aggregating the solutions into an ensemble, which could achieve performance superior to that of any individual solution. Ensembling is very common, especially in classification and other data science problems, and in this work we are the first to model it with the aim of deriving the optimal award scheme in open contests. Our results formally show that, given the option of ensembling, a top-K award scheme may yield a higher profit for the seeker than the winner-takes-all award scheme. Further, the optimal number of awards increases with contest parameters including the solvers' expertise, the number of solvers, and the return on effort, whereas it decreases with technical uncertainty when such uncertainty is sufficiently large. These findings are consistent with the increasing adoption of the top-K award scheme in contests held on Kaggle and other similar platforms.
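The claim that an ensemble can outperform any individual solution can be illustrated with a small calculation. The sketch below is not the model used in the paper: it assumes, purely for illustration, a binary classification task, K solutions that are each correct with the same probability p, errors that are independent across solutions, and aggregation by simple majority vote. The function name `majority_vote_accuracy` and the value p = 0.7 are hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, k: int) -> float:
    """Probability that a majority vote over k classifiers is correct.

    Illustrative assumptions (not taken from the paper):
    - binary labels,
    - every classifier is correct with the same probability p,
    - classifier errors are independent,
    - k is odd, so a tie cannot occur.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    needed = k // 2 + 1  # smallest number of correct votes that forms a majority
    # Sum the binomial probabilities of getting at least `needed` correct votes.
    return sum(
        comb(k, i) * p**i * (1.0 - p) ** (k - i)
        for i in range(needed, k + 1)
    )


if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, round(majority_vote_accuracy(0.7, k), 5))
```

Under these assumptions, a single solution with p = 0.7 is correct 70% of the time, while a majority vote over three such solutions is correct about 78.4% of the time and over five about 83.7%. Real contest submissions are typically correlated and of unequal quality, so the gain in practice may be smaller.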


Source journal: DECISION SCIENCES (MANAGEMENT)

CiteScore: 12.40

Self-citation rate: 1.80%

Journal description: Decision Sciences, a premier journal of the Decision Sciences Institute, publishes scholarly research about decision making within the boundaries of an organization, as well as decisions involving inter-firm coordination. The journal promotes research advancing decision making at the interfaces of business functions and organizational boundaries. The journal also seeks articles extending established lines of work, provided the results of the research have the potential to substantially impact either decision-making theory or industry practice. Ground-breaking research articles that enhance managerial understanding of decision-making processes and stimulate further research in multi-disciplinary domains are particularly encouraged.