Is Federated Learning Still Alive in the Foundation Model Era?

Nathalie Baracaldo
{"title":"基础教育模式时代的联盟式学习还存在吗?","authors":"Nathalie Baracaldo","doi":"10.1609/aaaiss.v3i1.31213","DOIUrl":null,"url":null,"abstract":"Federated learning (FL) has arisen as an alternative to collecting large amounts of data in a central place to train a machine learning (ML) model. FL is privacy-friendly, allowing multiple parties to collaboratively train an ML model without exchanging or transmitting their training data. For this purpose, an aggregator iteratively coordinates the training process among parties, and parties simply share with the aggregator model updates, which contain information pertinent to the model such as neural network weights. Besides privacy, generalization has been another key driver for FL: parties who do not have enough data to train a good performing model by themselves can now engage in FL to obtain an ML model suitable for their tasks. Products and real applications in the industry and consumer space have demonstrated the power of this learning paradigm.\n\n\nRecently, foundation models have taken the AI community by storm, promising to solve the shortage of labeled data. A foundation model is a powerful model that can be recycled for a variety of use cases by applying techniques such as zero-shot learning and full or parameter-efficient fine tuning. The premise is that the amount of data required to fine tune a foundation model for a new task is much smaller than fully training a traditional model from scratch. The reason why this is the case is that a good foundation model has already learned relevant general representations, and thus, adapting it to a new task only requires a minimal number of additional samples. This raises the question: Is FL still alive in the era of foundation models?\n\n\nIn this talk, I will address this question. I will present some use cases where FL is very much alive. In these use cases, finding a foundation model with a desired representation is difficult if not impossible. With this pragmatic point of view, I hope to shed some light into a real use case where disparate private data is available in isolation at different parties and where labels may be located at a single party that doesn’t have any other information, making it impossible for a single party to train a model on its own. Furthermore, in some vertically-partitioned scenarios, cleaning data is not an option due to privacy-related reasons and it is not clear how to apply foundation models. Finally, I will also go over a few other requirements that are often overlooked, such as unlearning of data and its implications for the lifecycle management of FL and systems based on foundation models.","PeriodicalId":516827,"journal":{"name":"Proceedings of the AAAI Symposium Series","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Is Federated Learning Still Alive in the Foundation Model Era?\",\"authors\":\"Nathalie Baracaldo\",\"doi\":\"10.1609/aaaiss.v3i1.31213\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated learning (FL) has arisen as an alternative to collecting large amounts of data in a central place to train a machine learning (ML) model. FL is privacy-friendly, allowing multiple parties to collaboratively train an ML model without exchanging or transmitting their training data. 
For this purpose, an aggregator iteratively coordinates the training process among parties, and parties simply share with the aggregator model updates, which contain information pertinent to the model such as neural network weights. Besides privacy, generalization has been another key driver for FL: parties who do not have enough data to train a good performing model by themselves can now engage in FL to obtain an ML model suitable for their tasks. Products and real applications in the industry and consumer space have demonstrated the power of this learning paradigm.\\n\\n\\nRecently, foundation models have taken the AI community by storm, promising to solve the shortage of labeled data. A foundation model is a powerful model that can be recycled for a variety of use cases by applying techniques such as zero-shot learning and full or parameter-efficient fine tuning. The premise is that the amount of data required to fine tune a foundation model for a new task is much smaller than fully training a traditional model from scratch. The reason why this is the case is that a good foundation model has already learned relevant general representations, and thus, adapting it to a new task only requires a minimal number of additional samples. This raises the question: Is FL still alive in the era of foundation models?\\n\\n\\nIn this talk, I will address this question. I will present some use cases where FL is very much alive. In these use cases, finding a foundation model with a desired representation is difficult if not impossible. With this pragmatic point of view, I hope to shed some light into a real use case where disparate private data is available in isolation at different parties and where labels may be located at a single party that doesn’t have any other information, making it impossible for a single party to train a model on its own. Furthermore, in some vertically-partitioned scenarios, cleaning data is not an option due to privacy-related reasons and it is not clear how to apply foundation models. Finally, I will also go over a few other requirements that are often overlooked, such as unlearning of data and its implications for the lifecycle management of FL and systems based on foundation models.\",\"PeriodicalId\":516827,\"journal\":{\"name\":\"Proceedings of the AAAI Symposium Series\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the AAAI Symposium Series\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1609/aaaiss.v3i1.31213\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the AAAI Symposium Series","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1609/aaaiss.v3i1.31213","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Federated learning (FL) has arisen as an alternative to collecting large amounts of data in a central place to train a machine learning (ML) model. FL is privacy-friendly, allowing multiple parties to collaboratively train an ML model without exchanging or transmitting their training data. For this purpose, an aggregator iteratively coordinates the training process among the parties, and the parties share with the aggregator only model updates, which contain information pertinent to the model, such as neural network weights. Besides privacy, generalization has been another key driver for FL: parties who do not have enough data to train a well-performing model by themselves can engage in FL to obtain an ML model suitable for their tasks. Products and real applications in the industry and consumer space have demonstrated the power of this learning paradigm.

Recently, foundation models have taken the AI community by storm, promising to solve the shortage of labeled data. A foundation model is a powerful model that can be reused for a variety of use cases by applying techniques such as zero-shot learning and full or parameter-efficient fine-tuning. The premise is that the amount of data required to fine-tune a foundation model for a new task is much smaller than that needed to fully train a traditional model from scratch. This is because a good foundation model has already learned relevant general representations, so adapting it to a new task requires only a minimal number of additional samples. This raises the question: is FL still alive in the era of foundation models?

In this talk, I will address this question. I will present some use cases where FL is very much alive: use cases in which finding a foundation model with the desired representation is difficult, if not impossible. With this pragmatic point of view, I hope to shed some light on a real use case where disparate private data is available in isolation at different parties, and where the labels may be located at a single party that has no other information, making it impossible for any single party to train a model on its own. Furthermore, in some vertically partitioned scenarios, cleaning the data is not an option for privacy-related reasons, and it is not clear how to apply foundation models. Finally, I will also go over a few other requirements that are often overlooked, such as unlearning of data and its implications for the lifecycle management of FL systems and systems based on foundation models.
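To make the round structure described in the abstract's first paragraph concrete, the following is a minimal FedAvg-style sketch in NumPy: an aggregator broadcasts a global model, each party trains locally on its own private data, and only the updated weights travel back for averaging. Everything here is illustrative, not from the talk: the parties' data, the linear model, and the hyperparameters are placeholders.

```python
# Minimal federated-averaging sketch: parties share only model weights,
# never raw data. All data and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Each party holds its own private dataset; only weights leave a party.
parties = [
    {"X": rng.normal(size=(100, 5)), "y": rng.normal(size=100)}
    for _ in range(3)
]

def local_update(weights, X, y, lr=0.01, epochs=5):
    """One party's local training: a few epochs of gradient descent on a
    linear model with squared loss. Returns updated weights only."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Aggregator loop: broadcast the global model, collect per-party updates,
# and average them, weighted by local dataset size.
global_w = np.zeros(5)
for _ in range(20):
    updates = [local_update(global_w, p["X"], p["y"]) for p in parties]
    sizes = np.array([len(p["y"]) for p in parties], dtype=float)
    global_w = np.average(updates, axis=0, weights=sizes)
```

The key property the sketch illustrates is that the aggregator sees only the weight vectors, so the protocol preserves data locality; real FL systems layer secure aggregation and other protections on top of this basic loop.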
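Similarly, the premise behind parameter-efficient adaptation in the second paragraph can be sketched as freezing a pretrained feature extractor and fitting only a small task head on a handful of labeled samples. The "backbone" below is a stand-in random projection rather than a real foundation model, and all names, shapes, and sample counts are assumptions for illustration.

```python
# Sketch of the parameter-efficient premise: a frozen representation plus a
# tiny trained head. The backbone here is a stand-in, not a real model.
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a pretrained backbone: a fixed (frozen) feature map.
W_frozen = rng.normal(size=(784, 64))

def features(x):
    return np.tanh(x @ W_frozen)  # frozen general-purpose representation

# A new task with very little labeled data.
X_small = rng.normal(size=(32, 784))
y_small = rng.integers(0, 2, size=32)

# Only the small head is trained: ridge-regularized least squares on the
# frozen features, with labels mapped to {-1, +1}.
Phi = features(X_small)
head = np.linalg.solve(Phi.T @ Phi + 0.1 * np.eye(64), Phi.T @ (2 * y_small - 1))

# Prediction = frozen backbone + tiny trained head (shown on the same
# samples purely for illustration).
pred = (features(X_small) @ head > 0).astype(int)
```

Because only the 64-dimensional head is fit, far fewer labeled samples are needed than for training the full model, which is exactly the data-efficiency argument the abstract attributes to foundation models.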