Label Quality Improvement in Crowdsourcing with Ensemble TSK Fuzzy Classifier

Xiongtao Zhang, Xingguang Pan, Shitong Wang
Published in: 2019 IEEE 14th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), 2019-11-01
DOI: 10.1109/ISKE47853.2019.9170348 (https://doi.org/10.1109/ISKE47853.2019.9170348)
Citations: 0

Abstract

Label Quality Improvement in Crowdsourcing with Ensemble TSK Fuzzy Classifier
Crowdsourcing, as a distributed approach, offers an effective and inexpensive way to solve large tasks. However, because workers differ in knowledge and skill, and because some of them are fraudsters, the quality of crowdsourced labels cannot be effectively controlled or guaranteed. This paper proposes a novel label quality improvement method, EW-TSK-CS, based on an ensemble TSK fuzzy classifier with high interpretability. Each subclassifier, TSKnoise-FC, is an improved zero-order TSK fuzzy classifier that is trained on data with noisy labels and is more robust to that noise. The objective function of each fuzzy subclassifier accounts for the presence of label noise, and each fuzzy subclassifier is able to handle uncertain data. The subclassifiers are integrated by incrementally augmenting the original noise-free validation data space with the output of each subclassifier. Dictionary data are then obtained by running classical FCM clustering on the augmented validation data and applying KNN. The label noise correction mechanism is based on this dictionary data. Experimental results on the Adult, chess, and waveform3 datasets show that the method effectively improves the quality of crowdsourced labels compared with traditional label-noise-robust methods, ensemble methods, and classical TSK fuzzy classifiers.
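As a rough illustration of the dictionary-based correction step described in the abstract, the Python sketch below runs classical FCM on clean validation data, labels each cluster center by a majority vote among its nearest validation points, and then relabels crowdsourced samples from the nearest dictionary entry. The abstract does not spell out the exact procedure, so the function names (fuzzy_c_means, build_dictionary, correct_labels), their parameters, the majority-vote rule, and the nearest-entry lookup are all assumptions made for illustration. The sketch also works on raw features only and leaves out the TSKnoise-FC subclassifier outputs that the paper uses to augment the validation space.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Classical FCM; returns (cluster centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)              # memberships of each sample sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                      # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def build_dictionary(X_val, y_val, n_clusters, k=5, seed=0):
    """Assumed rule: each FCM center gets the majority label of its k nearest clean samples."""
    centers, _ = fuzzy_c_means(X_val, n_clusters, seed=seed)
    labels = np.empty(n_clusters, dtype=y_val.dtype)
    for j, c in enumerate(centers):
        nearest = np.argsort(np.linalg.norm(X_val - c, axis=1))[:k]
        values, counts = np.unique(y_val[nearest], return_counts=True)
        labels[j] = values[np.argmax(counts)]
    return centers, labels

def correct_labels(X, y_noisy, dict_centers, dict_labels):
    """Assumed rule: take the label of the nearest dictionary entry; report which labels changed."""
    d = np.linalg.norm(X[:, None, :] - dict_centers[None, :, :], axis=2)
    y_dict = dict_labels[np.argmin(d, axis=1)]
    return y_dict, y_dict != y_noisy

# Toy data (hypothetical): two well-separated classes, three crowd labels flipped.
rng = np.random.default_rng(42)
X_val = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (20, 2))])
y_val = np.array([0] * 20 + [1] * 20)
X_crowd = np.vstack([rng.normal(0.0, 0.5, (5, 2)), rng.normal(10.0, 0.5, (5, 2))])
y_crowd = np.array([0, 1, 0, 0, 1, 1, 1, 0, 1, 1])

dict_centers, dict_labels = build_dictionary(X_val, y_val, n_clusters=2, k=5)
y_fixed, changed = correct_labels(X_crowd, y_crowd, dict_centers, dict_labels)
```

On real crowdsourced data the classes overlap, so the number of clusters and k would need tuning, and a correction rule that only overrides low-confidence labels may be preferable to relabeling every sample.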