Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing: Latest Publications

Collect, Measure, Repeat: Reliability Factors for Responsible AI Data Collection
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27547
Authors: Oana Inel, Tim Draws, Lora Aroyo
Abstract: The rapid entry of machine learning approaches in our daily activities and high-stakes domains demands transparency and scrutiny of their fairness and reliability. To help gauge machine learning models' robustness, research typically focuses on the massive datasets used for their deployment, e.g., creating and maintaining documentation for understanding their origin, process of development, and ethical considerations. However, data collection for AI is still typically a one-off practice, and oftentimes datasets collected for a certain purpose or application are reused for a different problem. Additionally, dataset annotations may not be representative over time, contain ambiguous or erroneous annotations, or be unable to generalize across issues or domains. Recent research has shown these practices might lead to unfair, biased, or inaccurate outcomes. We argue that data collection for AI should be performed in a responsible manner where the quality of the data is thoroughly scrutinized and measured through a systematic set of appropriate metrics. In this paper, we propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics for an iterative in-depth analysis of the factors influencing the quality and reliability of the generated data. We propose a granular set of measurements to inform on the internal reliability of a dataset and its external stability over time. We validate our approach across nine existing datasets and annotation tasks and four content modalities. This approach impacts the assessment of data robustness used for AI applied in the real world, where diversity of users and content is eminent. Furthermore, it deals with fairness and accountability aspects in data collection by providing systematic and transparent quality analysis for data collections.
Citations: 0
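The abstract above does not enumerate the specific reliability metrics the methodology uses, but inter-annotator agreement is a standard example of the kind of internal-reliability measurement such an analysis could include. The snippet below is a minimal, hypothetical sketch of Cohen's kappa for two annotators; it is illustrative only, not the authors' metric set, and all names and data in it are made up.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    assert len(labels_a) == len(labels_b) and labels_a, "need paired, non-empty label lists"
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labelled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

# Hypothetical example: two annotators labelling ten items.
a = ["pos", "pos", "neg", "pos", "neg", "neg", "pos", "pos", "neg", "pos"]
b = ["pos", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "neg", "pos"]
print(round(cohens_kappa(a, b), 3))  # 0.583
```

Values near 1 indicate agreement well above chance; values near 0 indicate agreement no better than chance, which would flag the annotation batch for closer inspection.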
A Cluster-Aware Transfer Learning for Bayesian Optimization of Personalized Preference Models
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27558
Authors: Haruto Yamasaki, Masaki Matsubara, Hiroyoshi Ito, Yuta Nambu, Masahiro Kohjima, Yuki Kurauchi, Ryuji Yamamoto, Atsuyuki Morishima
Abstract: Obtaining personalized models of the crowd is an important issue in various applications, such as preference acquisition and user interaction customization. However, the crowd setting, in which we assume we have little knowledge about the person, brings the cold start problem, which may cause avoidable unpreferable interactions with the people. This paper proposes a cluster-aware transfer learning method for the Bayesian optimization of personalized models. The proposed method, called Cluster-aware Bayesian Optimization, is designed based on a known feature: user preferences are not completely independent but can be divided into clusters. It exploits the clustering information to efficiently find the preference of the crowds while avoiding unpreferable interactions. The results of our extensive experiments with different data sets show that the method is efficient for finding the most preferable items and effective in reducing the number of unpreferable interactions.
Citations: 0
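The abstract above does not spell out how the cluster structure enters the Bayesian optimization loop, so the sketch below only illustrates the underlying intuition under loose assumptions: assign a new user to a known preference cluster from a few probe answers, then favour items that cluster rates highly so early interactions avoid likely-unpreferable items. This is a greedy warm start, not the authors' Cluster-aware Bayesian Optimization method; all function names, items, and numbers are hypothetical.

```python
import numpy as np

def assign_cluster(probe_ratings, probe_items, cluster_means):
    """Assign a new user to the preference cluster whose mean ratings on the
    probe items are closest (Euclidean) to the user's observed ratings."""
    probe_ratings = np.asarray(probe_ratings, dtype=float)
    dists = [np.linalg.norm(probe_ratings - mu[probe_items]) for mu in cluster_means]
    return int(np.argmin(dists))

def next_suggestion(cluster_mean, already_asked):
    """Greedily suggest the unseen item the assigned cluster likes most,
    so early interactions avoid items that cluster tends to dislike."""
    candidates = [i for i in range(len(cluster_mean)) if i not in already_asked]
    return max(candidates, key=lambda i: cluster_mean[i])

# Hypothetical data: 3 clusters of prior users, 6 items, ratings in [0, 1].
cluster_means = [np.array([0.9, 0.2, 0.8, 0.1, 0.7, 0.3]),
                 np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7]),
                 np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5])]
probe_items = [0, 1]          # items shown to the new user first
user_answers = [0.8, 0.3]     # the new user's ratings on those items
c = assign_cluster(user_answers, probe_items, cluster_means)
print(c, next_suggestion(cluster_means[c], set(probe_items)))  # 0 2
```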
Does Human Collaboration Enhance the Accuracy of Identifying LLM-Generated Deepfake Texts?
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27557
Authors: Adaku Uchendu, Jooyoung Lee, Hua Shen, Thai Le, Ting-Hao 'Kenneth' Huang, Dongwon Lee
Abstract: Advances in Large Language Models (e.g., GPT-4, LLaMA) have improved the generation of coherent sentences resembling human writing on a large scale, resulting in the creation of so-called deepfake texts. However, this progress poses security and privacy concerns, necessitating effective solutions for distinguishing deepfake texts from human-written ones. Although prior works studied humans' ability to detect deepfake texts, none has examined whether "collaboration" among humans improves the detection of deepfake texts. In this study, to address this gap of understanding on deepfake texts, we conducted experiments with two groups: (1) nonexpert individuals from the AMT platform and (2) writing experts from the Upwork platform. The results demonstrate that collaboration among humans can potentially improve the detection of deepfake texts for both groups, increasing detection accuracies by 6.36% for non-experts and 12.76% for experts, respectively, compared to individuals' detection accuracies. We further analyze the explanations that humans used for detecting a piece of text as deepfake text, and find that the strongest indicator of deepfake texts is their lack of coherence and consistency. Our study provides useful insights for future tools and framework designs to facilitate the collaborative human detection of deepfake texts. The experiment datasets and AMT implementations are available at: https://github.com/huashen218/llm-deepfake-human-study.git
Citations: 5
Where Does My Model Underperform? A Human Evaluation of Slice Discovery Algorithms
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27548
Authors: Nari Johnson, Ángel Alexander Cabrera, Gregory Plumb, Ameet Talwalkar
Abstract: Machine learning (ML) models that achieve high average accuracy can still underperform on semantically coherent subsets ("slices") of data. This behavior can have significant societal consequences for the safety or bias of the model in deployment, but identifying these underperforming slices can be difficult in practice, especially in domains where practitioners lack access to group annotations to define coherent subsets of their data. Motivated by these challenges, ML researchers have developed new slice discovery algorithms that aim to group together coherent and high-error subsets of data. However, there has been little evaluation focused on whether these tools help humans form correct hypotheses about where (for which groups) their model underperforms. We conduct a controlled user study (N = 15) where we show 40 slices output by two state-of-the-art slice discovery algorithms to users, and ask them to form hypotheses about an object detection model. Our results provide positive evidence that these tools provide some benefit over a naive baseline, and also shed light on challenges faced by users during the hypothesis formation step. We conclude by discussing design opportunities for ML and HCI researchers. Our findings point to the importance of centering users when creating and evaluating new tools for slice discovery.
Citations: 0
Rethinking Quality Assurance for Crowdsourced Multi-ROI Image Segmentation
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27552
Authors: Xiaolu Lu, David Ratcliffe, Tsu-Ting Kao, Aristarkh Tikhonov, Lester Litchfield, Craig Rodger, Kaier Wang
Abstract: Collecting high quality annotations to construct an evaluation dataset is essential for assessing the true performance of machine learning models. One popular way of performing data annotation is via crowdsourcing, where quality can be of concern. Despite much prior work addressing the annotation quality problem in crowdsourcing generally, little has been discussed in detail for image segmentation tasks. These tasks often require pixel-level annotation accuracy, and are relatively complex when compared to image classification or object detection with bounding-boxes. In this paper, we focus on image segmentation annotation via crowdsourcing, where images may not have been collected in a controlled way. In this setting, the task of annotating may be non-trivial, where annotators may experience difficulty in differentiating between regions-of-interest (ROIs) and background pixels. We implement an annotation process on a medical image annotation task and examine the effectiveness of several in-situ and manual quality assurance and quality control mechanisms. Our observations on this task are three-fold. Firstly, including an onboarding and a pilot phase improves quality assurance as annotators can familiarize themselves with the task, especially when the definition of ROIs is ambiguous. Secondly, we observe high variability of annotation times, leading us to believe it cannot be relied upon as a source of information for quality control. When performing agreement analysis, we also show that global-level inter-rater agreement is insufficient to provide useful information, especially when annotator skill levels vary. Thirdly, we recognize that reviewing all annotations can be time-consuming and often infeasible, and there currently exist no mechanisms to reduce the workload for reviewers. Therefore, we propose a method to create a priority list of images for review based on inter-rater agreement. Our experiments suggest that this method can be used to improve reviewer efficiency when compared to a baseline approach, especially if a fixed work budget is required.
Citations: 0
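The abstract above proposes prioritizing images for review by inter-rater agreement but does not state which agreement statistic is used. The sketch below assumes agreement is measured as the mean pairwise intersection-over-union (IoU) of the annotators' masks, which is one plausible choice rather than the paper's definition; all names and data are hypothetical.

```python
import numpy as np
from itertools import combinations

def mask_iou(a, b):
    """Intersection-over-union of two boolean segmentation masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def review_priority(annotations):
    """annotations: dict image_id -> list of boolean masks, one per annotator.
    Returns image ids ordered by ascending mean pairwise IoU, i.e. the images
    with the least inter-rater agreement come first for manual review."""
    agreement = {}
    for image_id, masks in annotations.items():
        pairs = list(combinations(masks, 2))
        agreement[image_id] = (sum(mask_iou(a, b) for a, b in pairs) / len(pairs)
                               if pairs else 1.0)
    return sorted(agreement, key=agreement.get)

# Hypothetical toy example: two 4x4 images, three annotators each.
rng = np.random.default_rng(0)
annotations = {"img_easy": [np.ones((4, 4), bool)] * 3,
               "img_hard": [rng.random((4, 4)) > 0.5 for _ in range(3)]}
print(review_priority(annotations))  # img_hard should come first
```

Images whose annotators disagree most surface first, so a reviewer with a fixed work budget spends it where corrections are most likely to be needed.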
Informing Users about Data Imputation: Exploring the Design Space for Dealing With Non-Responses
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27544
Authors: Ananya Bhattacharjee, Haochen Song, Xuening Wu, Justice Tomlinson, Mohi Reza, Akmar Ehsan Chowdhury, Nina Deliu, Thomas W. Price, Joseph Jay Williams
Abstract: Machine learning algorithms often require quantitative ratings from users to effectively predict helpful content. When these ratings are unavailable, systems make implicit assumptions or imputations to fill in the missing information; however, users are generally kept unaware of these processes. In our work, we explore ways of informing the users about system imputations, and experiment with imputed ratings and various explanations required by users to correct imputations. We investigate these approaches through the deployment of a text messaging probe to 26 participants to help them manage psychological wellbeing. We provide quantitative results to report users' reactions to correct vs incorrect imputations and potential risks of biasing their ratings. Using semi-structured interviews with participants, we characterize the potential trade-offs regarding user autonomy, and draw insights about alternative ways of involving users in the imputation process. Our findings provide useful directions for future research on communicating system imputation and interpreting user non-responses.
Citations: 0
How Crowd Worker Factors Influence Subjective Annotations: A Study of Tagging Misogynistic Hate Speech in Tweets
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27546
Authors: Danula Hettiachchi, Indigo Holcombe-James, Stephanie Livingstone, Anjalee De Silva, Matthew Lease, Flora D. Salim, Mark Sanderson
Abstract: Crowdsourced annotation is vital to both collecting labelled data to train and test automated content moderation systems and to support human-in-the-loop review of system decisions. However, annotation tasks such as judging hate speech are subjective and thus highly sensitive to biases stemming from annotator beliefs, characteristics and demographics. We conduct two crowdsourcing studies on Mechanical Turk to examine annotator bias in labelling sexist and misogynistic hate speech. Results from 109 annotators show that annotator political inclination, moral integrity, personality traits, and sexist attitudes significantly impact annotation accuracy and the tendency to tag content as hate speech. In addition, semi-structured interviews with nine crowd workers provide further insights regarding the influence of subjectivity on annotations. In exploring how workers interpret a task — shaped by complex negotiations between platform structures, task instructions, subjective motivations, and external contextual factors — we see annotations not only impacted by worker factors but also simultaneously shaped by the structures under which they labour.
Citations: 0
Task as Context: A Sensemaking Perspective on Annotating Inter-Dependent Event Attributes with Non-Experts
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27550
Authors: Tianyi Li, Ping Wang, Tian Shi, Yali Bian, Andy Esakia
Abstract: This paper explores the application of sensemaking theory to support non-expert crowds in intricate data annotation tasks. We investigate the influence of procedural context and data context on the annotation quality of novice crowds, defining procedural context as completing multiple related annotation tasks on the same data point, and data context as annotating multiple data points with semantic relevance. We conducted a controlled experiment involving 140 non-expert crowd workers, who generated 1400 event annotations across various procedural and data context levels. Assessments of annotations demonstrate that high procedural context positively impacts annotation quality, although this effect diminishes with lower data context. Notably, assigning multiple related tasks to novice annotators yields comparable quality to expert annotations, without costing additional time or effort. We discuss the trade-offs associated with procedural and data contexts and draw design implications for engaging non-experts in crowdsourcing complex annotation tasks.
Citations: 0
A Taxonomy of Human and ML Strengths in Decision-Making to Investigate Human-ML Complementarity
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27554
Authors: Charvi Rastogi, Liu Leqi, Kenneth Holstein, Hoda Heidari
Abstract: Hybrid human-ML systems increasingly make consequential decisions in a wide range of domains. These systems are often introduced with the expectation that the combined human-ML system will achieve complementary performance, that is, the combined decision-making system will be an improvement compared with either decision-making agent in isolation. However, empirical results have been mixed, and existing research rarely articulates the sources and mechanisms by which complementary performance is expected to arise. Our goal in this work is to provide conceptual tools to advance the way researchers reason and communicate about human-ML complementarity. Drawing upon prior literature in human psychology, machine learning, and human-computer interaction, we propose a taxonomy characterizing distinct ways in which human and ML-based decision-making can differ. In doing so, we conceptually map potential mechanisms by which combining human and ML decision-making may yield complementary performance, developing a language for the research community to reason about design of hybrid systems in any decision-making domain. To illustrate how our taxonomy can be used to investigate complementarity, we provide a mathematical aggregation framework to examine enabling conditions for complementarity. Through synthetic simulations, we demonstrate how this framework can be used to explore specific aspects of our taxonomy and shed light on the optimal mechanisms for combining human-ML judgments.
Citations: 0
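The paper's aggregation framework is more general than any single rule; to make the idea of combining human and ML judgments concrete, the sketch below uses a simple convex combination of probability estimates whose weight is tuned on held-out decisions. It is a toy stand-in, not the framework from the paper, and all data are hypothetical.

```python
import numpy as np

def combine(p_human, p_ml, w):
    """Convex combination of human and ML probability estimates for a binary decision."""
    return w * np.asarray(p_human, float) + (1 - w) * np.asarray(p_ml, float)

def best_weight(p_human, p_ml, y_true, grid=np.linspace(0.0, 1.0, 101)):
    """Pick the combination weight that maximises accuracy on held-out decisions."""
    y_true = np.asarray(y_true)
    accs = [((combine(p_human, p_ml, w) >= 0.5) == y_true).mean() for w in grid]
    return float(grid[int(np.argmax(accs))])

# Hypothetical held-out data: the human is more reliable on some cases,
# the ML model on others, so an intermediate weight wins.
y = np.array([1, 1, 0, 0, 1, 0, 1, 0])
p_h = np.array([0.9, 0.8, 0.2, 0.1, 0.4, 0.6, 0.7, 0.3])
p_m = np.array([0.6, 0.4, 0.6, 0.4, 0.9, 0.1, 0.8, 0.2])
print(best_weight(p_h, p_m, y))
```

If either agent dominates everywhere, the tuned weight collapses to 0 or 1; complementarity shows up as an interior weight that beats both endpoints, as it does on this toy data.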
Crowdsourced Clustering via Active Querying: Practical Algorithm with Theoretical Guarantees
Proceedings of the ... AAAI Conference on Human Computation and Crowdsourcing. Pub Date: 2023-11-03. DOI: 10.1609/hcomp.v11i1.27545
Authors: Yi Chen, Ramya Korlakai Vinayak, Babak Hassibi
Abstract: We consider the problem of clustering n items into K disjoint clusters using noisy answers from crowdsourced workers to pairwise queries of the type: "Are items i and j from the same cluster?" We propose a novel, practical, simple, and computationally efficient active querying algorithm for crowdsourced clustering. Furthermore, our algorithm does not require knowledge of unknown problem parameters. We show that our algorithm succeeds in recovering the clusters when the crowdworkers provide answers with an error probability less than 1/2 and provide sample complexity bounds on the number of queries made by our algorithm to guarantee successful clustering. While the bounds depend on the error probabilities, the algorithm itself does not require this knowledge. In addition to the theoretical guarantee, we implement and deploy the proposed algorithm on a real crowdsourcing platform to characterize its performance in real-world settings. Based on both the theoretical and the empirical results, we observe that while the total number of queries made by the active clustering algorithm is order-wise better than random querying, the advantage applies most conspicuously when the datasets have small clusters. For datasets with large enough clusters, passive querying can often be more efficient in practice. Our observations and practically implementable active clustering algorithm can inform and aid the design of real-world crowdsourced clustering systems. We make the dataset collected through this work publicly available (and the code to run such experiments).
Citations: 0
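The authors' algorithm and its sample-complexity guarantees are not reproduced here; the sketch below is only a naive baseline that illustrates the query model from the abstract: noisy pairwise "same cluster?" answers, aggregated by asking each query several times and taking a majority vote. Worker behaviour is simulated, and all parameters are hypothetical.

```python
import random

def noisy_same_cluster(i, j, true_labels, error_prob=0.2):
    """Simulated crowd answer to 'are items i and j in the same cluster?',
    flipped with probability error_prob."""
    truth = true_labels[i] == true_labels[j]
    return truth if random.random() > error_prob else not truth

def greedy_active_clustering(items, query, repeats=5):
    """Place each item by querying it against one representative per existing
    cluster and taking a majority vote over repeated noisy answers; open a new
    cluster if no representative wins the vote."""
    clusters = []
    for item in items:
        for cluster in clusters:
            votes = sum(query(item, cluster[0]) for _ in range(repeats))
            if votes * 2 > repeats:   # majority says "same cluster"
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters

random.seed(1)
true_labels = {i: i % 3 for i in range(30)}   # 3 ground-truth clusters
found = greedy_active_clustering(list(true_labels),
                                 lambda a, b: noisy_same_cluster(a, b, true_labels))
print(len(found), [sorted(c)[:5] for c in found])
```

With a per-answer error probability below 1/2, repeating each query drives the chance of a wrong majority down exponentially, which is the intuition behind the kind of guarantee the paper formalizes.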