FedBERT:当联邦学习遇到预训练

ACM Transactions on Intelligent Systems and Technology (TIST) Pub Date : 2022-02-04 DOI:10.1145/3510033

Yuanyishu Tian, Yao Wan, Lingjuan Lyu, Dezhong Yao, Hai Jin, Lichao Sun

{"title":"FedBERT:当联邦学习遇到预训练","authors":"Yuanyishu Tian, Yao Wan, Lingjuan Lyu, Dezhong Yao, Hai Jin, Lichao Sun","doi":"10.1145/3510033","DOIUrl":null,"url":null,"abstract":"The fast growth of pre-trained models (PTMs) has brought natural language processing to a new era, which has become a dominant technique for various natural language processing (NLP) applications. Every user can download the weights of PTMs, then fine-tune the weights for a task on the local side. However, the pre-training of a model relies heavily on accessing a large-scale of training data and requires a vast amount of computing resources. These strict requirements make it impossible for any single client to pre-train such a model. To grant clients with limited computing capability to participate in pre-training a large model, we propose a new learning approach, FedBERT, that takes advantage of the federated learning and split learning approaches, resorting to pre-training BERT in a federated way. FedBERT can prevent sharing the raw data information and obtain excellent performance. Extensive experiments on seven GLUE tasks demonstrate that FedBERT can maintain its effectiveness without communicating to the sensitive local data of clients.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"39","resultStr":"{\"title\":\"FedBERT: When Federated Learning Meets Pre-training\",\"authors\":\"Yuanyishu Tian, Yao Wan, Lingjuan Lyu, Dezhong Yao, Hai Jin, Lichao Sun\",\"doi\":\"10.1145/3510033\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The fast growth of pre-trained models (PTMs) has brought natural language processing to a new era, which has become a dominant technique for various natural language processing (NLP) applications. Every user can download the weights of PTMs, then fine-tune the weights for a task on the local side. However, the pre-training of a model relies heavily on accessing a large-scale of training data and requires a vast amount of computing resources. These strict requirements make it impossible for any single client to pre-train such a model. To grant clients with limited computing capability to participate in pre-training a large model, we propose a new learning approach, FedBERT, that takes advantage of the federated learning and split learning approaches, resorting to pre-training BERT in a federated way. FedBERT can prevent sharing the raw data information and obtain excellent performance. Extensive experiments on seven GLUE tasks demonstrate that FedBERT can maintain its effectiveness without communicating to the sensitive local data of clients.\",\"PeriodicalId\":123526,\"journal\":{\"name\":\"ACM Transactions on Intelligent Systems and Technology (TIST)\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-02-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"39\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Intelligent Systems and Technology (TIST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3510033\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Intelligent Systems and Technology (TIST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3510033","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 39

摘要

预训练模型(ptm)的快速发展将自然语言处理带入了一个新时代，它已成为各种自然语言处理(NLP)应用的主导技术。每个用户都可以下载ptm的权重，然后在本地对任务的权重进行微调。然而，模型的预训练严重依赖于访问大量的训练数据，需要大量的计算资源。这些严格的要求使得任何单个客户都不可能预先训练这样的模型。为了使计算能力有限的客户端能够参与大型模型的预训练，我们提出了一种新的学习方法，FedBERT，它利用了联邦学习和分裂学习方法，以联邦的方式对BERT进行预训练。FedBERT可以防止原始数据信息的共享，并获得优异的性能。在7个GLUE任务上的大量实验表明，FedBERT可以在不与客户端的敏感本地数据通信的情况下保持其有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

FedBERT: When Federated Learning Meets Pre-training

The fast growth of pre-trained models (PTMs) has brought natural language processing to a new era, which has become a dominant technique for various natural language processing (NLP) applications. Every user can download the weights of PTMs, then fine-tune the weights for a task on the local side. However, the pre-training of a model relies heavily on accessing a large-scale of training data and requires a vast amount of computing resources. These strict requirements make it impossible for any single client to pre-train such a model. To grant clients with limited computing capability to participate in pre-training a large model, we propose a new learning approach, FedBERT, that takes advantage of the federated learning and split learning approaches, resorting to pre-training BERT in a federated way. FedBERT can prevent sharing the raw data information and obtain excellent performance. Extensive experiments on seven GLUE tasks demonstrate that FedBERT can maintain its effectiveness without communicating to the sensitive local data of clients.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Intelligent Systems and Technology (TIST)

自引率

0.00%

发文量