Performance Metrics Evaluation Towards The Effectiveness of Data Anonymization

A. Raj, Rio G. L. D'Souza
2023 IEEE 8th International Conference for Convergence in Technology (I2CT), published 2023-04-07
DOI: 10.1109/I2CT57861.2023.10126310 (https://doi.org/10.1109/I2CT57861.2023.10126310)

Abstract

Data anonymization is a supplementary method for keeping private data inaccessible to outside parties. Anonymization may affect the outcomes of data mining procedures, since it can make the data harder for commonly used algorithms to analyze. This practical experience report compares the performance impact of current data anonymization algorithms with that of the suggested k-anonymization methods, using both original and anonymized data, in order to assess correctness and execution time. A sample of genuine data produced by a healthcare facility was anonymized using k-anonymization, l-diversity, t-closeness, and differential privacy techniques. Contrary to predictions, the Hadoop framework was able to handle the anonymization approaches, improving accuracy and performance while speeding up execution. These findings show that data anonymization techniques, when properly implemented through Hadoop ecosystems, can help increase the effectiveness of data anonymization. Furthermore, the suggested method can produce anonymized data with the necessary utility and protection trade-offs, and with performance that scales to large datasets.
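
As a rough illustration of two of the privacy models named in the abstract, the sketch below checks k-anonymity and distinct l-diversity on a toy table. It is not the paper's method or its Hadoop implementation: the records, the field names (age, zip, diagnosis), and every helper name are hypothetical, and distinct l-diversity is only the simplest variant of that model.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every class has at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


def generalize_age(record, width=10):
    """Replace an exact age with a range of the given width (hypothetical rule)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical toy records, not data from the paper.
RECORDS = [
    {"age": 23, "zip": "560**", "diagnosis": "flu"},
    {"age": 27, "zip": "560**", "diagnosis": "asthma"},
    {"age": 34, "zip": "560**", "diagnosis": "flu"},
    {"age": 38, "zip": "560**", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["age", "zip"]

GENERALIZED = [generalize_age(record) for record in RECORDS]

if __name__ == "__main__":
    print("raw data 2-anonymous:", is_k_anonymous(RECORDS, QUASI_IDENTIFIERS, 2))
    print("generalized 2-anonymous:", is_k_anonymous(GENERALIZED, QUASI_IDENTIFIERS, 2))
    print("generalized 2-diverse:",
          is_l_diverse(GENERALIZED, QUASI_IDENTIFIERS, "diagnosis", 2))
```

Generalizing exact ages into ten-year ranges is what moves the toy table from failing to passing the 2-anonymity check. The cost is coarser data, which is the utility and protection trade-off the abstract refers to.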