FedChain: A Collaborative Framework for Building Artificial Intelligence Models using Blockchain and Federated Learning

2021 8th NAFOSTED Conference on Information and Computer Science (NICS). Pub Date: 2021-12-21. DOI: 10.1109/NICS54270.2021.9701450

T. D. Luong, Vuong Minh Tien, Hoang Anh, Ngan Van Luyen, Nguyen Chi Vy, Phan The Duy, V. Pham


Citations: 0

Abstract



Machine learning (ML) has drawn attention from both academia and industry thanks to its outstanding advances and its potential in many fields. Nevertheless, collecting data for model training is difficult because of the many privacy concerns and data breaches reported recently. Data owners or holders are usually hesitant to share their private data. Also, the benefits from analyzing user data are not distributed to users. In addition, because there is no incentive mechanism for sharing data, ML builders cannot leverage the massive data available from many sources. Thus, this paper introduces FedChain, a collaborative approach for building artificial intelligence (AI) models that encourages many data owners to cooperate in the training phase without sharing their raw data. It helps data holders preserve privacy by running collaborative training on their own premises, while reducing the computation load that centralized training would impose. More specifically, we use federated learning (FL) and the Hyperledger Sawtooth blockchain to set up a prototype framework that enables many parties to join, contribute, and transparently receive rewards based on their training task results. Finally, we evaluate FedChain in a cyber threat intelligence context, where an AI model is trained across many organizations, each on its own private datastore, and is then used to detect malicious actions in the network. Experimental results with the CICIDS-2017 dataset show that the FL-based strategy can help create effective privacy-preserving ML models while taking advantage of diverse data sources from the community.
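
To illustrate the federated learning step the abstract refers to, here is a minimal sketch of sample-size-weighted parameter averaging (FedAvg-style). The abstract does not say which aggregation rule FedChain uses, so the choice of FedAvg is an assumption, and the function name `federated_average`, the flat-list parameter format, and the example organizations are all hypothetical. The blockchain and reward parts are not modeled.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighting each client by its sample count.

    Assumption: FedAvg-style aggregation; the source does not name the rule.
    Only parameters and sample counts are passed in, never raw training data.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            # Each client's contribution is proportional to its local data size.
            averaged[i] += w * size / total
    return averaged


# Hypothetical example: three organizations train locally and share only parameters.
org_updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
org_sizes = [100, 100, 200]
global_weights = federated_average(org_updates, org_sizes)
```

In this sketch the third organization holds half of the samples, so its parameters carry half of the weight in the aggregated model.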
