Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling

Junyuan Hong, Lingjuan Lyu, Jiayu Zhou, M. Spranger
Published in: Advances in Neural Information Processing Systems, vol. 35, pp. 20133-20146 (Journal Article)
Publication date: 2022-10-23
DOI: 10.48550/arXiv.2210.12575 (https://doi.org/10.48550/arXiv.2210.12575)
Citations: 3

Abstract

As deep learning blooms with growing demand for computation and data resources, outsourcing model training to a powerful cloud server becomes an attractive alternative to training at a low-power and cost-effective end device. Traditional outsourcing requires uploading device data to the cloud server, which can be infeasible in many real-world applications due to the often sensitive nature of the collected data and the limited communication bandwidth. To tackle these challenges, we propose to leverage widely available open-source data, which is a massive dataset collected from public and heterogeneous sources (e.g., Internet images). We develop a novel strategy called Efficient Collaborative Open-source Sampling (ECOS) to construct a proximal proxy dataset from open-source data for cloud training, in lieu of client data. ECOS probes open-source data on the cloud server to sense the distribution of client data via a communication- and computation-efficient sampling process, which only communicates a few compressed public features and client scalar responses. Extensive empirical studies show that the proposed ECOS improves the quality of automated client labeling, model compression, and label outsourcing when applied in various learning scenarios.
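
The abstract describes the sampling exchange only at a high level: the cloud sends a few compressed public features, and the client answers with scalars. As a rough illustration of that kind of exchange, the sketch below compresses open-source features into a handful of centroids, lets the client return one count per centroid, and samples a proxy set from the clusters the client voted for. The use of k-means for compression, nearest-centroid voting, the function names (`kmeans`, `client_votes`, `sample_proxy`), and the toy data are all assumptions made for illustration; this is not the authors' ECOS implementation.

```python
import numpy as np


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed compression step). Returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centroids does not touch `features`.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # leave an empty cluster's centroid where it is
                centroids[j] = members.mean(axis=0)
    # Recompute assignments so they match the final centroids.
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def client_votes(client_features, centroids):
    """Client side: return one scalar per centroid (how many local samples are nearest to it)."""
    dists = np.linalg.norm(client_features[:, None, :] - centroids[None, :, :], axis=2)
    return np.bincount(dists.argmin(axis=1), minlength=len(centroids))


def sample_proxy(assign, votes, budget, seed=0):
    """Cloud side: draw open-source indices, weighting each cluster by its share of votes."""
    rng = np.random.default_rng(seed)
    cluster_sizes = np.bincount(assign, minlength=len(votes))
    # Each sample's weight is its cluster's votes spread evenly over the cluster.
    weights = votes[assign] / cluster_sizes[assign]
    if weights.sum() == 0:
        raise ValueError("no open-source sample lies in a cluster the client voted for")
    probs = weights / weights.sum()
    # Sampling without replacement cannot return more items than have nonzero probability.
    budget = min(budget, int(np.count_nonzero(probs)))
    return rng.choice(len(assign), size=budget, replace=False, p=probs)


# Toy data (hypothetical): open-source features form two blobs; the client only resembles the second.
rng = np.random.default_rng(42)
open_source = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(200, 2)),   # indices 0..199
    rng.normal(loc=10.0, scale=0.5, size=(200, 2)),  # indices 200..399
])
client = rng.normal(loc=10.0, scale=0.5, size=(50, 2))

centroids, assign = kmeans(open_source, k=4)           # cloud: compress public features
votes = client_votes(client, centroids)                # client: only these counts are uploaded
proxy_idx = sample_proxy(assign, votes, budget=40)     # cloud: pick the proxy dataset

print("votes per centroid:", votes)
print("share of proxy samples from the client-like blob:", (proxy_idx >= 200).mean())
```

In this sketch the client never uploads raw data: it receives `k` centroid vectors and sends back `k` integers, which is the sense in which the exchange is cheap in both directions. A privacy-conscious variant could perturb the counts before upload, but that is outside what the abstract states.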