TIFF:联邦学习的标记化激励

Q1 Computer Science

IEEE Cloud Computing Pub Date : 2022-07-01 DOI:10.1109/CLOUD55607.2022.00064

Jingoo Han, Ahmad Faraz Khan, Syed Zawad, Ali Anwar, Nathalie Baracaldo Angel, Yi Zhou, Feng Yan, A. Butt

{"title":"TIFF:联邦学习的标记化激励","authors":"Jingoo Han, Ahmad Faraz Khan, Syed Zawad, Ali Anwar, Nathalie Baracaldo Angel, Yi Zhou, Feng Yan, A. Butt","doi":"10.1109/CLOUD55607.2022.00064","DOIUrl":null,"url":null,"abstract":"In federated learning (FL), clients collectively train a global machine learning model with their own local data. Without sharing sensitive raw data, each client in FL only sends updated weights to consider privacy and security concerns. Most of existing FL works focus mainly on improving model accuracy and training time, but only a few works focus on FL incentive mechanisms. To build a high performance model after FL training, clients need to provide high quality and large amounts of data. However, in real FL scenarios, high-quality clients are reluctant to participate in FL process without reasonable compensation, because clients are self-interested and other clients can be business competitors. Even participation incurs some cost for contributing to the FL model with their local dataset. To address this problem, we propose TIFF, a novel tokenized incentive mechanism, where tokens are used as a means of paying for the services of providing participants and the training infrastructure. Without payment delays, participation can be monetized as both providers and consumers, which promotes continued long-term participation of high-quality data parties. Additionally, paid tokens are reimbursed to each client as consumers according to our newly proposed metrics (such as token reduction ratio and utility improvement ratio), which keeps clients engaged in FL process as consumers. To measure data quality, accuracy is calculated in training without additional overheads. We leverage historical accuracy records and random exploration to select high-utility participants and to prevent overfitting. Results show that TIFF provides more tokens to normal providers by up to 6.9% and less tokens to malicious providers by up to 18.1%, achieving improvement of the final model accuracy by up to 7.4%, compared to the default approach.","PeriodicalId":54281,"journal":{"name":"IEEE Cloud Computing","volume":"103 1","pages":"407-416"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"TIFF: Tokenized Incentive for Federated Learning\",\"authors\":\"Jingoo Han, Ahmad Faraz Khan, Syed Zawad, Ali Anwar, Nathalie Baracaldo Angel, Yi Zhou, Feng Yan, A. Butt\",\"doi\":\"10.1109/CLOUD55607.2022.00064\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In federated learning (FL), clients collectively train a global machine learning model with their own local data. Without sharing sensitive raw data, each client in FL only sends updated weights to consider privacy and security concerns. Most of existing FL works focus mainly on improving model accuracy and training time, but only a few works focus on FL incentive mechanisms. To build a high performance model after FL training, clients need to provide high quality and large amounts of data. However, in real FL scenarios, high-quality clients are reluctant to participate in FL process without reasonable compensation, because clients are self-interested and other clients can be business competitors. Even participation incurs some cost for contributing to the FL model with their local dataset. To address this problem, we propose TIFF, a novel tokenized incentive mechanism, where tokens are used as a means of paying for the services of providing participants and the training infrastructure. Without payment delays, participation can be monetized as both providers and consumers, which promotes continued long-term participation of high-quality data parties. Additionally, paid tokens are reimbursed to each client as consumers according to our newly proposed metrics (such as token reduction ratio and utility improvement ratio), which keeps clients engaged in FL process as consumers. To measure data quality, accuracy is calculated in training without additional overheads. We leverage historical accuracy records and random exploration to select high-utility participants and to prevent overfitting. Results show that TIFF provides more tokens to normal providers by up to 6.9% and less tokens to malicious providers by up to 18.1%, achieving improvement of the final model accuracy by up to 7.4%, compared to the default approach.\",\"PeriodicalId\":54281,\"journal\":{\"name\":\"IEEE Cloud Computing\",\"volume\":\"103 1\",\"pages\":\"407-416\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Cloud Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLOUD55607.2022.00064\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Cloud Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLOUD55607.2022.00064","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Computer Science","Score":null,"Total":0}

引用次数: 5

摘要

在联邦学习(FL)中，客户端使用自己的本地数据共同训练全局机器学习模型。在不共享敏感原始数据的情况下，FL中的每个客户端只发送更新的权重，以考虑隐私和安全问题。现有的人工智能研究大多集中在提高模型精度和训练时间上，而对人工智能激励机制的研究较少。为了在FL培训后建立一个高性能的模型，客户需要提供高质量和大量的数据。然而，在真实的FL场景中，如果没有合理的补偿，高质量的客户是不愿意参与FL过程的，因为客户是自利的，其他客户可能是业务竞争对手。即使是参与，也会因为使用本地数据集为FL模型做出贡献而产生一些成本。为了解决这个问题，我们提出了TIFF，这是一种新的代币化激励机制，其中代币被用作支付提供参与者和培训基础设施的服务的手段。在没有支付延迟的情况下，作为提供商和消费者的参与都可以货币化，这促进了高质量数据各方的持续长期参与。此外，根据我们新提出的指标(如代币减少率和效用改进率)，支付的代币作为消费者补偿给每个客户端，这使客户端作为消费者参与FL流程。为了测量数据质量，准确度是在训练中计算的，没有额外的开销。我们利用历史准确性记录和随机探索来选择高效用参与者并防止过拟合。结果表明，与默认方法相比，TIFF向正常提供者提供的令牌最多可达6.9%，向恶意提供者提供的令牌最多可达18.1%，最终模型精度提高了7.4%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

TIFF: Tokenized Incentive for Federated Learning

In federated learning (FL), clients collectively train a global machine learning model with their own local data. Without sharing sensitive raw data, each client in FL only sends updated weights to consider privacy and security concerns. Most of existing FL works focus mainly on improving model accuracy and training time, but only a few works focus on FL incentive mechanisms. To build a high performance model after FL training, clients need to provide high quality and large amounts of data. However, in real FL scenarios, high-quality clients are reluctant to participate in FL process without reasonable compensation, because clients are self-interested and other clients can be business competitors. Even participation incurs some cost for contributing to the FL model with their local dataset. To address this problem, we propose TIFF, a novel tokenized incentive mechanism, where tokens are used as a means of paying for the services of providing participants and the training infrastructure. Without payment delays, participation can be monetized as both providers and consumers, which promotes continued long-term participation of high-quality data parties. Additionally, paid tokens are reimbursed to each client as consumers according to our newly proposed metrics (such as token reduction ratio and utility improvement ratio), which keeps clients engaged in FL process as consumers. To measure data quality, accuracy is calculated in training without additional overheads. We leverage historical accuracy records and random exploration to select high-utility participants and to prevent overfitting. Results show that TIFF provides more tokens to normal providers by up to 6.9% and less tokens to malicious providers by up to 18.1%, achieving improvement of the final model accuracy by up to 7.4%, compared to the default approach.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Cloud Computing Computer Science-Computer Networks and Communications

CiteScore

11.20

自引率

0.00%

发文量

期刊介绍： Cessation. IEEE Cloud Computing is committed to the timely publication of peer-reviewed articles that provide innovative research ideas, applications results, and case studies in all areas of cloud computing. Topics relating to novel theory, algorithms, performance analyses and applications of techniques are covered. More specifically: Cloud software, Cloud security, Trade-offs between privacy and utility of cloud, Cloud in the business environment, Cloud economics, Cloud governance, Migrating to the cloud, Cloud standards, Development tools, Backup and recovery, Interoperability, Applications management, Data analytics, Communications protocols, Mobile cloud, Private clouds, Liability issues for data loss on clouds, Data integration, Big data, Cloud education, Cloud skill sets, Cloud energy consumption, The architecture of cloud computing, Applications in commerce, education, and industry, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), Business Process as a Service (BPaaS)