Quality-aided Annotation Service Selection in MLaaS Market

Shanyang Jiang, Lan Zhang
Published in: 2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)
Publication date: 2022-06-10
DOI: 10.1109/IWQoS54832.2022.9812877
URL: https://doi.org/10.1109/IWQoS54832.2022.9812877

Abstract

Markets for data annotation services are growing fast and play an important part in machine learning. Although many multi-label prediction services are available, consumers find it hard to decide which services fit their tasks and budgets, because the services differ in labeling categories, labeling quality, and price. In this paper, we focus on the practical problem of obtaining high-quality multi-label annotation data from multiple services under a budget constraint. We propose a framework that first parameterizes label generation with a probabilistic graphical model and then uses an Expectation-Maximization (EM)-based iterative algorithm to estimate each service's labeling quality and the truth distribution of each task. We then cast annotation service selection as an adaptive submodular coverage maximization problem, which motivates an adaptive random greedy algorithm with a constant approximation ratio of 1−1/e. We evaluate our design through real-world experiments and a series of simulations with various machine learning models and real datasets. The experiments show that our method improves accuracy and reliability.
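To make the quality-estimation step more concrete, the following is a minimal sketch of EM-based truth inference. It is not the paper's algorithm: the abstract does not give the model details, so the sketch assumes a much simpler setting in which each label is binary, each service has a single unknown accuracy (a "one-coin" model), and services err independently. The function name `em_one_coin`, its parameters, and the toy data are all hypothetical.

```python
from typing import List, Optional, Tuple

Label = Optional[int]  # 1, 0, or None when a service gave no answer


def em_one_coin(labels: List[List[Label]], n_iter: int = 50,
                eps: float = 1e-6) -> Tuple[List[float], List[float]]:
    """Estimate per-service accuracy and per-item P(truth = 1).

    labels[i][j] is the answer of service j on item i.
    Assumption (not from the paper): one accuracy per service,
    binary truth, independent errors.
    """
    n_items, n_services = len(labels), len(labels[0])

    # Initialise posteriors with the fraction of positive votes per item.
    q = []
    for row in labels:
        votes = [v for v in row if v is not None]
        q.append(sum(votes) / len(votes) if votes else 0.5)

    acc = [0.5] * n_services
    for _ in range(n_iter):
        # M-step: re-estimate the class prior and each service's accuracy
        # as its expected agreement with the current soft truth.
        prior = min(max(sum(q) / n_items, eps), 1 - eps)
        for j in range(n_services):
            agree, count = 0.0, 0
            for i, row in enumerate(labels):
                if row[j] is None:
                    continue
                agree += q[i] if row[j] == 1 else 1 - q[i]
                count += 1
            if count:
                # Clamp so that no likelihood factor becomes exactly zero.
                acc[j] = min(max(agree / count, eps), 1 - eps)

        # E-step: posterior of truth = 1 for each item via Bayes' rule.
        for i, row in enumerate(labels):
            p1, p0 = prior, 1 - prior
            for j, v in enumerate(row):
                if v is None:
                    continue
                p1 *= acc[j] if v == 1 else 1 - acc[j]
                p0 *= acc[j] if v == 0 else 1 - acc[j]
            q[i] = p1 / (p1 + p0)
    return acc, q


# Toy data: 8 items, 3 hypothetical services (columns).
TOY_LABELS = [
    [1, 1, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1],
    [0, 0, 1], [0, 0, 0], [0, 0, 0], [0, 1, 0],
]

if __name__ == "__main__":
    accuracy, posterior = em_one_coin(TOY_LABELS)
    print("estimated accuracies:", [round(a, 3) for a in accuracy])
    print("inferred labels:", [int(p > 0.5) for p in posterior])
```

Quality estimates of this kind are what a budget-aware selection step would consume: a service with high estimated accuracy and a low price is a better candidate than one whose answers add little information. A multi-label setting with heterogeneous label categories, as described in the abstract, would need a richer model than this one-coin sketch.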