Distributed Learning of Pure Non-IID Data Using Latent Codes

Anirudh Kasturi, A. Agrawal, C. Hota
DOI: 10.1109/IMCOM56909.2023.10035595
Published in: 2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM), 3 January 2023

Abstract

There has been a huge increase in the amount of data generated as a result of the proliferation of high-tech, data-generating devices made possible by recent developments in mobile technology. This has rekindled interest in creating smart applications that can exploit this data and provide insightful results. Concerns about bandwidth, privacy, and latency arise when data from many devices is aggregated in one location to produce more precise predictions. This research presents a novel distributed learning approach in which a Variational Auto Encoder is trained locally on each client and then used to derive a sample set of points centrally. The server then trains a unified global model and sends its parameters to all users. Pure non-i.i.d. distributions, in which each client only sees data labelled with a single value, are the primary focus of our study. According to our findings, communication between the server and the clients takes significantly less time than it does in federated and centralised learning setups. We further demonstrate that, when the data is distributed in a pure non-i.i.d. fashion, our method achieves accuracy more than 4% higher than the federated learning strategy. We also show that, compared to centralised and federated learning systems, our proposed method requires less network bandwidth.
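The workflow described in the abstract — each client trains a local generative model, the server samples synthetic points from those models and fits one global model, then broadcasts its parameters — can be illustrated with a minimal sketch. This is not the paper's implementation: as a stand-in for a trained Variational Auto Encoder, each client here fits a per-dimension Gaussian to its data, and the server's "global model" is a nearest-centroid classifier; all function names and the two-client, two-label setup are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def client_fit(x):
    # Client-side generative model (toy stand-in for a trained VAE):
    # only the mean and std of the local data are shared, never the data.
    return x.mean(axis=0), x.std(axis=0) + 1e-6

def server_sample(params, label, n=200):
    # Server draws a synthetic sample set from a client's generative model.
    mu, sigma = params
    xs = rng.normal(mu, sigma, size=(n, mu.shape[0]))
    return xs, np.full(n, label)

# Pure non-i.i.d. split: client k only ever sees data with label k.
clients = {0: rng.normal(-2.0, 1.0, size=(500, 2)),
           1: rng.normal(+2.0, 1.0, size=(500, 2))}

# Server aggregates synthetic points from every client's model
# and trains a single unified global model on them.
X, y = [], []
for label, data in clients.items():
    xs, ys = server_sample(client_fit(data), label)
    X.append(xs)
    y.append(ys)
X, y = np.vstack(X), np.concatenate(y)

centroids = np.stack([X[y == k].mean(axis=0) for k in (0, 1)])

def predict(x):
    # The global model's parameters (here, the centroids) are what the
    # server would broadcast back to all clients.
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

acc = np.mean([predict(x) == k
               for k, data in clients.items() for x in data[:100]])
print(round(float(acc), 2))
```

Note the communication pattern the abstract emphasises: only compact generative-model parameters travel client-to-server, and only global-model parameters travel server-to-client, which is why the bandwidth cost can be lower than shipping raw data (centralised) or repeated gradient rounds (federated).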