与人群一起过滤:重新审视CrowdScreen

B. Groz, Ezra Levin, I. Meilijson, T. Milo
{"title":"与人群一起过滤:重新审视CrowdScreen","authors":"B. Groz, Ezra Levin, I. Meilijson, T. Milo","doi":"10.4230/LIPIcs.ICDT.2016.12","DOIUrl":null,"url":null,"abstract":"Filtering a set of items, based on a set of properties that can be verified by humans, is a common application of CrowdSourcing. When the workers are error-prone, each item is presented to multiple users, to limit the probability of misclassification. Since the Crowd is a relatively expensive resource, minimizing the number of questions per item may naturally result in big savings. Several algorithms to address this minimization problem have been presented in the CrowdScreen framework by Parameswaran et al. However, those algorithms do not scale well and therefore cannot be used in scenarios where high accuracy is required in spite of high user error rates. The goal of this paper is thus to devise algorithms that can cope with such situations. To achieve this, we provide new theoretical insights to the problem, then use them to develop a new efficient algorithm. We also propose novel optimizations for the algorithms of CrowdScreen that improve their scalability. We complement our theoretical study by an experimental evaluation of the algorithms on a large set of synthetic parameters as well as real-life crowdsourcing scenarios, demonstrating the advantages of our solution.","PeriodicalId":90482,"journal":{"name":"Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Filtering With the Crowd: CrowdScreen Revisited\",\"authors\":\"B. Groz, Ezra Levin, I. Meilijson, T. Milo\",\"doi\":\"10.4230/LIPIcs.ICDT.2016.12\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Filtering a set of items, based on a set of properties that can be verified by humans, is a common application of CrowdSourcing. When the workers are error-prone, each item is presented to multiple users, to limit the probability of misclassification. Since the Crowd is a relatively expensive resource, minimizing the number of questions per item may naturally result in big savings. Several algorithms to address this minimization problem have been presented in the CrowdScreen framework by Parameswaran et al. However, those algorithms do not scale well and therefore cannot be used in scenarios where high accuracy is required in spite of high user error rates. The goal of this paper is thus to devise algorithms that can cope with such situations. To achieve this, we provide new theoretical insights to the problem, then use them to develop a new efficient algorithm. We also propose novel optimizations for the algorithms of CrowdScreen that improve their scalability. We complement our theoretical study by an experimental evaluation of the algorithms on a large set of synthetic parameters as well as real-life crowdsourcing scenarios, demonstrating the advantages of our solution.\",\"PeriodicalId\":90482,\"journal\":{\"name\":\"Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-03-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4230/LIPIcs.ICDT.2016.12\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4230/LIPIcs.ICDT.2016.12","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

摘要

基于一组可以由人类验证的属性来筛选一组项目,这是众包的一个常见应用。当工作人员容易出错时,每个项目都呈现给多个用户,以限制错误分类的概率。由于Crowd是一种相对昂贵的资源,因此最小化每个条目的问题数量自然会节省大量资源。Parameswaran等人在CrowdScreen框架中提出了几种解决这个最小化问题的算法。然而,这些算法不能很好地扩展,因此不能用于要求高精度的场景,尽管用户错误率很高。因此,本文的目标是设计出能够处理这种情况的算法。为了实现这一目标,我们为问题提供了新的理论见解,然后使用它们来开发新的高效算法。我们还对CrowdScreen算法提出了新的优化,以提高其可扩展性。我们通过对大量合成参数和现实生活中的众包场景的算法进行实验评估来补充我们的理论研究,证明了我们的解决方案的优势。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Filtering With the Crowd: CrowdScreen Revisited
Filtering a set of items, based on a set of properties that can be verified by humans, is a common application of CrowdSourcing. When the workers are error-prone, each item is presented to multiple users, to limit the probability of misclassification. Since the Crowd is a relatively expensive resource, minimizing the number of questions per item may naturally result in big savings. Several algorithms to address this minimization problem have been presented in the CrowdScreen framework by Parameswaran et al. However, those algorithms do not scale well and therefore cannot be used in scenarios where high accuracy is required in spite of high user error rates. The goal of this paper is thus to devise algorithms that can cope with such situations. To achieve this, we provide new theoretical insights to the problem, then use them to develop a new efficient algorithm. We also propose novel optimizations for the algorithms of CrowdScreen that improve their scalability. We complement our theoretical study by an experimental evaluation of the algorithms on a large set of synthetic parameters as well as real-life crowdsourcing scenarios, demonstrating the advantages of our solution.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信