Worker Filtering with Limited Supervision in Crowdsourcing Systems

Lingyu Lyu, M. Kantardzic, Hanqing Hu
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 802-807. Published 2018-12-01. DOI: 10.1109/ICMLA.2018.00128 (https://doi.org/10.1109/ICMLA.2018.00128)
Citations: 1

Abstract

To obtain high-quality labels in crowdsourcing applications, it is important to recognize and handle noisy workers. In particular, spam workers, who assign labels to items at random, can greatly degrade the quality of crowdsourced labels. We therefore propose a semi-supervised worker filtering (SWF) approach to filter this type of worker out of the crowd. The SWF model recognizes spam workers by using a limited set of gold truths. An optimization-based truth discovery framework, which minimizes the total error in the workers' labels, is integrated with the semi-supervised worker filtering approach (SWF-TD) to infer the true labels of unlabeled items. The efficacy of the proposed methodology is demonstrated on both synthetic and real-world datasets. Experimental analysis on the real-world datasets showed that, with around 40% of the gold truths used as prior knowledge, the SWF-TD approach can achieve performance similar to that of the fully labeled worker filtering model.
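The two stages described in the abstract, filtering spam workers with a small gold set and then inferring the remaining labels by truth discovery, can be illustrated with a minimal sketch. This is not the authors' SWF or SWF-TD formulation, which the abstract does not specify. It assumes a simple accuracy threshold above chance for filtering and a generic weighted-vote scheme with log-based worker weights for truth discovery. All names (`filter_workers`, `truth_discovery`, `margin`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def filter_workers(labels, gold, num_classes, margin=0.15):
    """Keep workers whose accuracy on gold items beats chance by `margin`.

    labels: {worker: {item: label}}, gold: {item: true_label}.
    The threshold rule is an assumption made for this sketch.
    """
    chance = 1.0 / num_classes
    kept = {}
    for worker, answers in labels.items():
        scored = [item for item in answers if item in gold]
        if not scored:
            kept[worker] = answers  # no gold evidence, so do not filter
            continue
        accuracy = sum(answers[i] == gold[i] for i in scored) / len(scored)
        if accuracy > chance + margin:
            kept[worker] = answers
    return kept


def truth_discovery(labels, gold, iterations=10):
    """Alternate between weighted voting and re-weighting workers by error."""
    weights = {worker: 1.0 for worker in labels}
    truths = dict(gold)
    for _ in range(iterations):
        # Truth step: weighted vote per item (ties go to the smallest label).
        votes = defaultdict(lambda: defaultdict(float))
        for worker, answers in labels.items():
            for item, label in answers.items():
                votes[item][label] += weights[worker]
        truths = {item: max(sorted(c), key=c.get) for item, c in votes.items()}
        truths.update(gold)  # gold labels are never overridden
        # Weight step: fewer disagreements with current truths, larger weight.
        errors = {
            worker: sum(label != truths[item] for item, label in answers.items()) + 1e-6
            for worker, answers in labels.items()
        }
        total = sum(errors.values())
        weights = {worker: -math.log(err / total) for worker, err in errors.items()}
    return truths


# Toy data (hypothetical): binary labels, gold truths known for i1 and i2 only.
labels = {
    "w1": {"i1": 1, "i2": 0, "i3": 1, "i4": 0, "i5": 1},
    "w2": {"i1": 1, "i2": 0, "i3": 1, "i4": 0, "i5": 0},
    "w3": {"i1": 1, "i2": 0, "i3": 0, "i4": 0, "i5": 1},
    "spam": {"i1": 0, "i2": 0, "i3": 0, "i4": 1, "i5": 0},
}
gold = {"i1": 1, "i2": 0}

kept = filter_workers(labels, gold, num_classes=2)
truths = truth_discovery(kept, gold)
```

In this toy run the worker named `spam` scores at chance on the gold items and is dropped before voting, so its labels do not influence the inferred truths for the items without gold labels.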