大数据中确定数据相关性的联邦学习方法

Ronald Doku, D. Rawat, Chunmei Liu
{"title":"大数据中确定数据相关性的联邦学习方法","authors":"Ronald Doku, D. Rawat, Chunmei Liu","doi":"10.1109/IRI.2019.00039","DOIUrl":null,"url":null,"abstract":"In the past few years, data has proliferated to astronomical proportions; as a result, big data has become the driving force behind the growth of many machine learning innovations. However, the incessant generation of data in the information age poses a needle in the haystack problem, where it has become challenging to determine useful data from a heap of irrelevant ones. This has resulted in a quality over quantity issue in data science where a lot of data is being generated, but the majority of it is irrelevant. Furthermore, most of the data and the resources needed to effectively train machine learning models are owned by major tech companies, resulting in a centralization problem. As such, federated learning seeks to transform how machine learning models are trained by adopting a distributed machine learning approach. Another promising technology is the blockchain, whose immutable nature ensures data integrity. By combining the blockchain's trust mechanism and federated learning's ability to disrupt data centralization, we propose an approach that determines relevant data and stores the data in a decentralized manner.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":"{\"title\":\"Towards Federated Learning Approach to Determine Data Relevance in Big Data\",\"authors\":\"Ronald Doku, D. Rawat, Chunmei Liu\",\"doi\":\"10.1109/IRI.2019.00039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the past few years, data has proliferated to astronomical proportions; as a result, big data has become the driving force behind the growth of many machine learning innovations. However, the incessant generation of data in the information age poses a needle in the haystack problem, where it has become challenging to determine useful data from a heap of irrelevant ones. This has resulted in a quality over quantity issue in data science where a lot of data is being generated, but the majority of it is irrelevant. Furthermore, most of the data and the resources needed to effectively train machine learning models are owned by major tech companies, resulting in a centralization problem. As such, federated learning seeks to transform how machine learning models are trained by adopting a distributed machine learning approach. Another promising technology is the blockchain, whose immutable nature ensures data integrity. By combining the blockchain's trust mechanism and federated learning's ability to disrupt data centralization, we propose an approach that determines relevant data and stores the data in a decentralized manner.\",\"PeriodicalId\":295028,\"journal\":{\"name\":\"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"26\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IRI.2019.00039\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IRI.2019.00039","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 26

摘要

在过去的几年里,数据激增到天文数字;因此,大数据已经成为许多机器学习创新背后的驱动力。然而,信息时代不断产生的数据构成了大海捞针的问题,从一堆不相关的数据中确定有用的数据已成为一项挑战。这导致了数据科学中质量大于数量的问题,即生成了大量数据,但其中大部分是不相关的。此外,有效训练机器学习模型所需的大部分数据和资源都由大型科技公司拥有,这导致了集中化问题。因此,联邦学习试图通过采用分布式机器学习方法来改变机器学习模型的训练方式。另一项有前途的技术是区块链,其不可变的性质确保了数据的完整性。通过结合区块链的信任机制和联邦学习破坏数据集中化的能力,我们提出了一种确定相关数据并以分散方式存储数据的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Towards Federated Learning Approach to Determine Data Relevance in Big Data
In the past few years, data has proliferated to astronomical proportions; as a result, big data has become the driving force behind the growth of many machine learning innovations. However, the incessant generation of data in the information age poses a needle in the haystack problem, where it has become challenging to determine useful data from a heap of irrelevant ones. This has resulted in a quality over quantity issue in data science where a lot of data is being generated, but the majority of it is irrelevant. Furthermore, most of the data and the resources needed to effectively train machine learning models are owned by major tech companies, resulting in a centralization problem. As such, federated learning seeks to transform how machine learning models are trained by adopting a distributed machine learning approach. Another promising technology is the blockchain, whose immutable nature ensures data integrity. By combining the blockchain's trust mechanism and federated learning's ability to disrupt data centralization, we propose an approach that determines relevant data and stores the data in a decentralized manner.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信