Analysis and Optimization of Clustering-based Privacy Preservation using Machine Learning

C. Sowmyarani, L. G. Namya, G. K. Nidhi, P. Ramakanth Kumar
Published in: 2023 International Conference for Advancement in Technology (ICONAT), 2023-01-24
DOI: https://doi.org/10.1109/ICONAT57137.2023.10080207

Abstract

The volume of collected data has grown steadily over the years. Advances in machine learning, deep learning, and data visualization have made predictive analysis for data-driven decision-making increasingly important. To realize the full potential of collected data, it must be published and made available to a wider audience, which may breach the privacy of the data subjects. To curb such breaches, the dataset must be anonymized. This work uses the SAC algorithm [1] to anonymize the acquired dataset. A predictive analysis is carried out using the model most appropriate for the given input data. A comparative analysis of the results obtained from the private and the anonymized data then examines how anonymization may affect overall data analysis. This work also puts forth a machine-learning technique that groups records optimally to minimize the loss of data quality during anonymization.
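
The abstract does not spell out the steps of the SAC algorithm, so the sketch below is not SAC and not the authors' method. It is a minimal illustration of the general idea of clustering-based anonymization, assuming a single numeric quasi-identifier: records are grouped into clusters of at least k members, each value is replaced by its cluster mean, and the loss of data quality is estimated as the mean absolute change. The function names `anonymize_by_groups` and `information_loss`, the sample ages, and the choice of k are all hypothetical.

```python
from statistics import mean


def anonymize_by_groups(records, k):
    """Replace each numeric value with the mean of its group.

    Illustrative only: groups are consecutive runs of the sorted
    values, each holding at least k records. This is NOT the SAC
    algorithm referenced in the abstract.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    # Indices of the records, ordered by value.
    order = sorted(range(len(records)), key=lambda i: records[i])

    # Consecutive chunks of size k over the sorted indices.
    groups = [order[i:i + k] for i in range(0, len(order), k)]

    # A short final chunk is merged into the one before it so that
    # every group still has at least k members.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    anonymized = [0.0] * len(records)
    for group in groups:
        centroid = mean(records[i] for i in group)
        for i in group:
            anonymized[i] = centroid
    return anonymized, groups


def information_loss(original, anonymized):
    """Mean absolute difference between original and released values."""
    return mean(abs(o - a) for o, a in zip(original, anonymized))


# Hypothetical sample data: ages as the quasi-identifier.
ages = [23, 25, 31, 35, 36, 52, 58]
released, groups = anonymize_by_groups(ages, k=3)
loss = information_loss(ages, released)
print(released)
print(f"information loss: {loss:.2f}")
```

A larger k hides each record among more neighbours but tends to raise the information loss, which is the trade-off that an optimized grouping of records aims to improve.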