Zhenyu Shu, V. Sheng, Yang Zhang, Dianhong Wang, J. Zhang, Heng Chen
{"title":"整合主动学习与监督的众包泛化","authors":"Zhenyu Shu, V. Sheng, Yang Zhang, Dianhong Wang, J. Zhang, Heng Chen","doi":"10.1109/ICMLA.2015.13","DOIUrl":null,"url":null,"abstract":"With various online crowdsourcing platforms, it is easy to collect multiple labels for the same examples from the crowd. Consensus integration algorithms can infer the estimated ground truths from the multiple label sets of these crowdsourcing datasets. However, it couldn't be avoided that these integrated (estimated) labels still contain noises. In order to further improve the performance of a model learned from data with these integrated labels, we propose an active learning framework to further improve the data quality, such that to improve the model quality, through acquiring limited true labels from experts (the oracle). We further investigate two active learning strategies in terms of two uncertainty measures (i.e., CLUE and MUE) within the active learning framework. From our experimental results on eight simulation crowdsourcing datasets and four real-world crowdsourcing datasets with three popular consensus integration algorithms, we draw several conclusions as follows. (i) Our active learning framework with the input from the oracle significantly improves the generalization ability of the model learned from crowdsourcing data. (ii) Our two active learning strategies outperform a random active learning strategy.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"35 22","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Integrating Active Learning with Supervision for Crowdsourcing Generalization\",\"authors\":\"Zhenyu Shu, V. Sheng, Yang Zhang, Dianhong Wang, J. Zhang, Heng Chen\",\"doi\":\"10.1109/ICMLA.2015.13\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With various online crowdsourcing platforms, it is easy to collect multiple labels for the same examples from the crowd. Consensus integration algorithms can infer the estimated ground truths from the multiple label sets of these crowdsourcing datasets. However, it couldn't be avoided that these integrated (estimated) labels still contain noises. In order to further improve the performance of a model learned from data with these integrated labels, we propose an active learning framework to further improve the data quality, such that to improve the model quality, through acquiring limited true labels from experts (the oracle). We further investigate two active learning strategies in terms of two uncertainty measures (i.e., CLUE and MUE) within the active learning framework. From our experimental results on eight simulation crowdsourcing datasets and four real-world crowdsourcing datasets with three popular consensus integration algorithms, we draw several conclusions as follows. (i) Our active learning framework with the input from the oracle significantly improves the generalization ability of the model learned from crowdsourcing data. 
(ii) Our two active learning strategies outperform a random active learning strategy.\",\"PeriodicalId\":288427,\"journal\":{\"name\":\"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)\",\"volume\":\"35 22\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA.2015.13\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2015.13","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Integrating Active Learning with Supervision for Crowdsourcing Generalization
With various online crowdsourcing platforms, it is easy to collect multiple labels for the same examples from the crowd. Consensus integration algorithms can infer estimated ground truths from the multiple label sets of these crowdsourcing datasets. However, it is unavoidable that these integrated (estimated) labels still contain noise. To further improve the performance of a model learned from data with these integrated labels, we propose an active learning framework that improves the data quality, and thus the model quality, by acquiring a limited number of true labels from experts (the oracle). We further investigate two active learning strategies based on two uncertainty measures (i.e., CLUE and MUE) within this framework. From our experimental results on eight simulated crowdsourcing datasets and four real-world crowdsourcing datasets with three popular consensus integration algorithms, we draw the following conclusions: (i) our active learning framework with input from the oracle significantly improves the generalization ability of the model learned from crowdsourcing data; (ii) our two active learning strategies outperform a random active learning strategy.
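The sketch below illustrates the workflow the abstract describes: multiple crowd labels are first integrated into consensus labels, and an active-learning loop then spends a small oracle budget on the examples whose consensus is least certain before training a model. Majority voting stands in for the consensus integration algorithms, and the vote-margin uncertainty is only an illustrative measure, not the paper's CLUE or MUE definitions; names such as `query_budget` are hypothetical.

```python
# Illustrative sketch: consensus integration of crowd labels, followed by
# uncertainty-driven oracle queries, followed by supervised training.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic binary task: features, hidden ground truth, and noisy crowd labels.
n, d, n_workers = 500, 10, 5
X = rng.normal(size=(n, d))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)
crowd = np.array([np.where(rng.random(n) < 0.7, y_true, 1 - y_true)
                  for _ in range(n_workers)])          # shape (n_workers, n)

# Consensus integration via majority vote (placeholder for the paper's
# consensus integration algorithms).
pos_votes = crowd.sum(axis=0)
y_hat = (pos_votes > n_workers / 2).astype(int)

# Label uncertainty: how close the vote is to a tie (illustrative only).
uncertainty = 1.0 - np.abs(pos_votes / n_workers - 0.5) * 2

# Active learning: query the oracle for the most uncertain examples.
query_budget = 50                                      # hypothetical budget
query_idx = np.argsort(-uncertainty)[:query_budget]
y_hat[query_idx] = y_true[query_idx]                   # oracle returns true labels

# Train on the partially corrected integrated labels.
model = RandomForestClassifier(random_state=0).fit(X, y_hat)
print("accuracy vs. ground truth:", model.score(X, y_true))
```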