Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling

Junyuan Hong, Lingjuan Lyu, Jiayu Zhou, Michael Spranger
{"title":"Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling.","authors":"Junyuan Hong,&nbsp;Lingjuan Lyu,&nbsp;Jiayu Zhou,&nbsp;Michael Spranger","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>As deep learning blooms with growing demand for computation and data resources, outsourcing model training to a powerful cloud server becomes an attractive alternative to training at a low-power and cost-effective end device. Traditional outsourcing requires uploading device data to the cloud server, which can be infeasible in many real-world applications due to the often sensitive nature of the collected data and the limited communication bandwidth. To tackle these challenges, we propose to leverage widely available <i>open-source data</i>, which is a massive dataset collected from public and heterogeneous sources (e.g., Internet images). We develop a novel strategy called Efficient Collaborative Open-source Sampling (ECOS) to construct a proximal proxy dataset from open-source data for cloud training, in lieu of client data. ECOS probes open-source data on the cloud server to sense the distribution of client data via a communication- and computation-efficient sampling process, which only communicates a few compressed public features and client scalar responses. Extensive empirical studies show that the proposed ECOS improves the quality of automated client labeling, model compression, and label outsourcing when applied in various learning scenarios.</p>","PeriodicalId":72099,"journal":{"name":"Advances in neural information processing systems","volume":"35 ","pages":"20133-20146"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10157828/pdf/nihms-1888095.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in neural information processing systems","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

As deep learning blooms with a growing demand for computation and data resources, outsourcing model training to a powerful cloud server becomes an attractive alternative to training on a low-power, cost-effective end device. Traditional outsourcing requires uploading device data to the cloud server, which can be infeasible in many real-world applications due to the often sensitive nature of the collected data and the limited communication bandwidth. To tackle these challenges, we propose to leverage widely available open-source data: a massive dataset collected from public and heterogeneous sources (e.g., Internet images). We develop a novel strategy called Efficient Collaborative Open-source Sampling (ECOS) to construct a proximal proxy dataset from open-source data for cloud training, in lieu of client data. ECOS probes the open-source data on the cloud server to sense the distribution of client data via a communication- and computation-efficient sampling process, which communicates only a few compressed public features and scalar client responses. Extensive empirical studies show that the proposed ECOS improves the quality of automated client labeling, model compression, and label outsourcing when applied in various learning scenarios.
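To make the probing loop described above concrete, below is a minimal sketch of one collaborative sampling round. It assumes pre-extracted features, a k-means compression of the open-source pool, and count-style scalar responses from the client; all function names, the clustering choice, and the proportional quota rule are illustrative assumptions rather than the paper's exact protocol.

```python
# Minimal sketch of an ECOS-style collaborative sampling round (illustrative
# assumptions, not the paper's exact protocol): the cloud compresses the
# open-source feature pool into a few public centroids, the client replies
# with one scalar per centroid, and the cloud samples a proxy dataset.
import numpy as np
from sklearn.cluster import KMeans


def cloud_compress(open_features, n_clusters=20):
    """Cloud: compress the open-source pool into a few public centroids."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(open_features)
    return km.cluster_centers_, labels


def client_respond(client_features, centroids):
    """Client: one scalar per centroid (count of local samples nearest to
    it); the raw private data never leaves the device."""
    dists = np.linalg.norm(
        client_features[:, None, :] - centroids[None, :, :], axis=-1
    )
    nearest = dists.argmin(axis=1)
    return np.bincount(nearest, minlength=len(centroids)).astype(float)


def cloud_sample(labels, votes, budget, rng=None):
    """Cloud: draw open-source samples in proportion to the client's votes,
    forming a proxy dataset that approximates the client distribution."""
    if rng is None:
        rng = np.random.default_rng(0)
    quotas = np.floor(budget * votes / votes.sum()).astype(int)
    picked = []
    for c, q in enumerate(quotas):
        pool = np.where(labels == c)[0]
        if q > 0 and len(pool) > 0:
            picked.extend(rng.choice(pool, size=min(q, len(pool)), replace=False))
    return np.asarray(picked)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    open_feats = rng.normal(size=(5000, 128))            # public pool (cloud)
    client_feats = rng.normal(loc=0.5, size=(300, 128))  # private (device)

    centroids, labels = cloud_compress(open_feats)       # downlink: 20x128 floats
    votes = client_respond(client_feats, centroids)      # uplink: 20 scalars
    proxy = cloud_sample(labels, votes, budget=1000)
    print(f"selected {len(proxy)} open-source samples as a proxy dataset")
```

In this toy round the device uploads only n_clusters scalars and downloads n_clusters x feature_dim floats, which is the communication-efficiency point the abstract makes; a real deployment would likely perturb or clip the scalar responses for stronger privacy.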
