Cross-worker joint modeling-based label integration for crowdsourcing

Impact factor: 3.0 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Approximate Reasoning Pub Date : 2025-09-11 DOI:10.1016/j.ijar.2025.109570

Can Pan, Liangxiao Jiang, Shanshan Si

International Journal of Approximate Reasoning, Volume 187, Article 109570. https://www.sciencedirect.com/science/article/pii/S0888613X25002117

Citations: 0

Abstract

In crowdsourcing learning, label integration algorithms infer each instance's integrated label from its set of multiple noisy labels. Recent work has shown that worker modeling is an effective way to improve label integration performance. In real-world crowdsourcing scenarios, however, each worker often annotates only a few instances, which leads to insufficient worker modeling. To address this issue, we propose a novel cross-worker joint modeling-based label integration (CJMLI) algorithm. Unlike existing algorithms that model each worker individually, CJMLI exploits cross-worker joint modeling to mitigate the impact of insufficient worker modeling. Specifically, we first employ majority voting to obtain initial integrated labels and then use them to estimate worker qualities. Subsequently, for each instance, we randomly select a subset of workers to estimate its class membership probabilities and then generate a weighted instance for each class. Next, we use the weighted instances to train a classifier. This process is repeated several times to obtain multiple classifiers. Finally, we fuse the classifiers' predicted labels by weighted majority voting to infer the final integrated label of each instance. Extensive experiments demonstrate that CJMLI significantly outperforms all its competitors.
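The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the agreement-rate estimate of worker quality, the quality-weighted vote share used as class membership probability, the tie-breaking rule, and all function and variable names below are assumptions made for illustration. Classifier training on the weighted instances is left out, since the abstract does not name the base learner; only the final fusion step is shown.

```python
import random
from collections import Counter, defaultdict

def majority_vote(labels_by_instance):
    """Initial integrated label per instance; ties go to the smallest label (assumption)."""
    integrated = {}
    for inst, votes in labels_by_instance.items():
        counts = Counter(votes.values())
        top = max(counts.values())
        integrated[inst] = min(lbl for lbl, c in counts.items() if c == top)
    return integrated

def worker_quality(labels_by_instance, integrated):
    """Assumed quality measure: fraction of a worker's labels matching the integrated labels."""
    agree, total = defaultdict(int), defaultdict(int)
    for inst, votes in labels_by_instance.items():
        for worker, lbl in votes.items():
            total[worker] += 1
            agree[worker] += int(lbl == integrated[inst])
    return {w: agree[w] / total[w] for w in total}

def class_probabilities(votes, quality, classes, subset_size, rng):
    """Assumed estimate: quality-weighted vote share over a random subset of the workers."""
    workers = sorted(votes)
    chosen = rng.sample(workers, min(subset_size, len(workers)))
    score = {c: 0.0 for c in classes}
    for w in chosen:
        score[votes[w]] += quality[w]
    norm = sum(score.values())
    if norm == 0:  # every chosen worker has zero quality: fall back to uniform
        return {c: 1.0 / len(classes) for c in classes}
    return {c: s / norm for c, s in score.items()}

def weighted_instances(labels_by_instance, quality, classes, subset_size, rng):
    """One (instance, class, weight) triple per class with non-zero probability."""
    triples = []
    for inst, votes in labels_by_instance.items():
        probs = class_probabilities(votes, quality, classes, subset_size, rng)
        triples.extend((inst, c, p) for c, p in probs.items() if p > 0)
    return triples

def weighted_majority_vote(predictions, weights):
    """Fuse the labels predicted by several classifiers, each with its own weight."""
    score = defaultdict(float)
    for lbl, w in zip(predictions, weights):
        score[lbl] += w
    return max(sorted(score), key=score.get)

# Toy data (hypothetical): three instances, three workers, binary labels.
labels = {
    "x1": {"w1": 0, "w2": 0, "w3": 1},
    "x2": {"w1": 1, "w2": 1, "w3": 1},
    "x3": {"w1": 0, "w2": 1, "w3": 1},
}
initial = majority_vote(labels)
quality = worker_quality(labels, initial)
training_set = weighted_instances(labels, quality, [0, 1], 2, random.Random(0))
print(initial, quality, training_set, sep="\n")
```

In a full run, the `weighted_instances` step would be repeated with different random subsets, one classifier would be trained on each resulting weighted set, and `weighted_majority_vote` would combine the classifiers' predictions for every instance.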

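Source journal

International Journal of Approximate Reasoning (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.90
Self-citation rate: 12.80%
Articles published: 170
Review time: 67 days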

About the journal: The International Journal of Approximate Reasoning is intended to serve as a forum for the treatment of imprecision and uncertainty in Artificial and Computational Intelligence, covering both the foundations of uncertainty theories and the design of intelligent systems for scientific and engineering applications. It publishes high-quality research papers describing theoretical developments or innovative applications, as well as review articles on topics of general interest. Relevant topics include, but are not limited to, probabilistic reasoning and Bayesian networks, imprecise probabilities, random sets, belief functions (Dempster-Shafer theory), possibility theory, fuzzy sets, rough sets, decision theory, non-additive measures and integrals, qualitative reasoning about uncertainty, comparative probability orderings, game-theoretic probability, default reasoning, nonstandard logics, argumentation systems, inconsistency-tolerant reasoning, elicitation techniques, philosophical foundations and psychological models of uncertain reasoning. Domains of application for uncertain reasoning systems include risk analysis and assessment, information retrieval and database design, information fusion, machine learning, data and web mining, computer vision, image and signal processing, intelligent data analysis, statistics, multi-agent systems, etc.