Joint Generative-Discriminative Aggregation Model for Multi-Option Crowd Labels

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. Pub Date: 2018-02-02. DOI: 10.1145/3159652.3159672

Kamran Ghasedi Dizaji, Yanhua Yang, Heng Huang

Citations: 2

Abstract

Several crowdsourcing aggregation models have been introduced to aggregate noisy crowd labels, but they mostly take single-option (i.e., discrete) crowd labels as input and are not compatible with multi-option (i.e., non-deterministic) crowd data. In this paper, we propose a novel joint generative-discriminative aggregation model that handles both single-option and multi-option crowd labels efficiently. Taking each worker's confidence in each option as the input data, we first introduce a new discriminative aggregation model, called Constrained Weighted Majority Voting (CWMVL1), which improves on the majority voting method. CWMVL1 uses flexible reliability parameters for crowd workers, employs an L1-norm loss function to deal with noisy crowd data, and includes optimization constraints so that its outputs are probabilistic. We prove that our objective is convex and derive an efficient optimization algorithm. Moreover, we integrate the discriminative CWMVL1 model with a generative model, resulting in a powerful joint aggregation model. The combination of these sub-models is obtained in a probabilistic framework rather than heuristically. For our joint model, we derive an efficient optimization algorithm that alternates between updating the parameters and estimating the potential true labels. Experimental results indicate that the proposed aggregation models achieve superior or competitive results compared with state-of-the-art models on single-option and multi-option crowd datasets, while converging faster and giving more reliable predictions.
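To make the input format concrete, the following is a minimal sketch of plain weighted voting over multi-option crowd labels. It is not the CWMVL1 model: the function name `weighted_vote`, the data layout, and the hand-picked reliability weights are all assumptions made for illustration, and the weights here are fixed rather than learned. Each worker supplies a confidence vector over the options for an item (a single-option label is the one-hot special case), and the aggregated scores are normalized so that each item's output is a probability distribution.

```python
from typing import List, Optional, Sequence


def weighted_vote(
    confidences: Sequence[Sequence[Optional[Sequence[float]]]],
    weights: Sequence[float],
    n_options: int,
) -> List[List[float]]:
    """Aggregate multi-option crowd labels by weighted voting.

    confidences[j][i] is worker j's confidence vector over the options
    for item i, or None if worker j did not label item i.
    weights[j] is a non-negative reliability weight for worker j
    (hypothetical fixed values, not learned from data).
    Returns one probability vector per item.
    """
    n_items = len(confidences[0])
    aggregated = []
    for i in range(n_items):
        scores = [0.0] * n_options
        for worker_labels, w in zip(confidences, weights):
            vec = worker_labels[i]
            if vec is None:
                continue  # this worker skipped the item
            for k in range(n_options):
                scores[k] += w * vec[k]
        total = sum(scores)
        if total > 0:
            # Normalize so the output is a distribution over options
            aggregated.append([s / total for s in scores])
        else:
            # No usable votes: fall back to a uniform distribution
            aggregated.append([1.0 / n_options] * n_options)
    return aggregated


# Three workers, two items, two options (illustrative values only)
crowd = [
    [[1.0, 0.0], [0.5, 0.5]],   # worker 0: one-hot label, then an unsure label
    [[0.0, 1.0], [0.0, 1.0]],   # worker 1: single-option labels
    [[0.8, 0.2], None],         # worker 2: skipped the second item
]
reliability = [2.0, 1.0, 1.0]
probabilities = weighted_vote(crowd, reliability, n_options=2)
predicted = [max(range(2), key=lambda k: p[k]) for p in probabilities]
print(probabilities, predicted)
```

According to the abstract, CWMVL1 differs from this sketch in that the reliability parameters are fit with an L1-norm loss under constraints that keep the outputs probabilistic, whereas here the weights are set by hand and the normalization is applied directly.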
