A Marketplace for Data: An Algorithmic Solution

Anish Agarwal, M. Dahleh, Tuhin Sarkar
{"title":"A Marketplace for Data: An Algorithmic Solution","authors":"Anish Agarwal, M. Dahleh, Tuhin Sarkar","doi":"10.1145/3328526.3329589","DOIUrl":null,"url":null,"abstract":"In this work, we aim to design a data marketplace; a robust real-time matching mechanism to efficiently buy and sell training data for Machine Learning tasks. While the monetization of data and pre-trained models is an essential focus of industry today, there does not exist a market mechanism to price training data and match buyers to sellers while still addressing the associated (computational and other) complexity. The challenge in creating such a market stems from the very nature of data as an asset: (i) it is freely replicable; (ii) its value is inherently combinatorial due to correlation with signal in other data; (iii) prediction tasks and the value of accuracy vary widely; (iv) usefulness of training data is difficult to verify a priori without first applying it to a prediction task. As our main contributions we: (i) propose a mathematical model for a two-sided data market and formally define the key associated challenges; (ii) construct algorithms for such a market to function and analyze how they meet the challenges defined. We highlight two technical contributions: (i) a new notion of \"fairness\" required for cooperative games with freely replicable goods; (ii) a truthful, zero regret mechanism to auction a class of combinatorial goods based on utilizing Myerson's payment function and the Multiplicative Weights algorithm. These might be of independent interest.","PeriodicalId":416173,"journal":{"name":"Proceedings of the 2019 ACM Conference on Economics and Computation","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"151","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 ACM Conference on Economics and Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3328526.3329589","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 151

Abstract

In this work, we aim to design a data marketplace; a robust real-time matching mechanism to efficiently buy and sell training data for Machine Learning tasks. While the monetization of data and pre-trained models is an essential focus of industry today, there does not exist a market mechanism to price training data and match buyers to sellers while still addressing the associated (computational and other) complexity. The challenge in creating such a market stems from the very nature of data as an asset: (i) it is freely replicable; (ii) its value is inherently combinatorial due to correlation with signal in other data; (iii) prediction tasks and the value of accuracy vary widely; (iv) usefulness of training data is difficult to verify a priori without first applying it to a prediction task. As our main contributions we: (i) propose a mathematical model for a two-sided data market and formally define the key associated challenges; (ii) construct algorithms for such a market to function and analyze how they meet the challenges defined. We highlight two technical contributions: (i) a new notion of "fairness" required for cooperative games with freely replicable goods; (ii) a truthful, zero regret mechanism to auction a class of combinatorial goods based on utilizing Myerson's payment function and the Multiplicative Weights algorithm. These might be of independent interest.
数据市场:一个算法解决方案
在这项工作中,我们的目标是设计一个数据市场;一个强大的实时匹配机制,有效地购买和出售机器学习任务的训练数据。虽然数据和预训练模型的货币化是当今行业的一个重要焦点,但目前还没有一个市场机制来为训练数据定价,并在解决相关(计算和其他)复杂性的同时,将买家和卖家匹配起来。创建这样一个市场的挑战源于数据作为一种资产的本质:(i)它可以自由复制;(ii)其值由于与其他数据中的信号相关而具有固有的组合性;(iii)预测任务和准确度值差异很大;(iv)如果不首先将训练数据应用于预测任务,则很难先验地验证训练数据的有用性。作为我们的主要贡献,我们:(i)提出了一个双边数据市场的数学模型,并正式定义了关键的相关挑战;(ii)构建这样一个市场运行的算法,并分析它们如何应对所定义的挑战。我们强调了两项技术贡献:(1)提供免费复制商品的合作游戏所需的“公平”新概念;(ii)基于Myerson支付函数和乘法权重算法的拍卖一类组合商品的真实、零后悔机制。这些可能是独立的利益。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信