C. Sowmyarani, L. G. Namya, G. K. Nidhi, P. Ramakanth Kumar
Title: Analysis and Optimization of Clustering-based Privacy Preservation using Machine Learning
Published in: 2023 International Conference for Advancement in Technology (ICONAT)
Publication date: 2023-01-24
DOI: 10.1109/ICONAT57137.2023.10080207
Citation count: 0
Abstract
The volume of data being collected has grown steadily over the years. Advances in Machine Learning, Deep Learning and Data Visualization have made predictive analysis an important basis for data-driven decisions. To realize its full potential, collected data must be published and made available to a wider audience, but publication may breach the privacy of the data subjects. To prevent such a breach, the dataset must be anonymized before release. This work uses the SAC Algorithm [1] to anonymize the acquired dataset. A predictive analysis is carried out using the model most appropriate for the given input data. A comparative analysis of the results obtained from the private and the anonymized data then examines how anonymization may affect overall data analysis. This work also proposes a Machine Learning-based technique that groups records optimally so as to minimize the loss of data quality during anonymization.
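
The trade-off described above, between grouping records for anonymity and preserving data quality, can be illustrated with a minimal sketch. This is not the SAC Algorithm [1] and not the grouping technique proposed in this work, neither of which is specified in the abstract. It is a generic, assumed example: records are sorted on a single numeric attribute, cut into groups of at least `k`, and each value is replaced by its group mean. The function names (`group_records`, `anonymize`, `information_loss`), the sample ages and the loss measure are all hypothetical choices made for illustration.

```python
from statistics import mean


def group_records(values, k):
    """Return groups of record indices, each holding at least k records.

    Assumption: a simple sort-and-chunk grouping on one numeric attribute,
    standing in for a real clustering step.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    # Fold a short trailing group into its neighbour so no group is smaller than k.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def anonymize(values, k):
    """Replace every value with the mean of the group it was assigned to."""
    anonymized = [0.0] * len(values)
    for group in group_records(values, k):
        group_mean = mean(values[i] for i in group)
        for i in group:
            anonymized[i] = group_mean
    return anonymized


def information_loss(values, anonymized):
    """Mean absolute change per record, normalised by the attribute's range.

    Assumption: an illustrative quality measure, not one taken from the paper.
    """
    span = max(values) - min(values)
    if span == 0:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(values, anonymized))
    return total / (len(values) * span)


if __name__ == "__main__":
    ages = [23, 25, 31, 35, 47, 52, 58]  # hypothetical private values
    for k in (1, 2, 3):
        released = anonymize(ages, k)
        print(k, released, round(information_loss(ages, released), 3))
```

With `k = 1` every record forms its own group and nothing is lost. Larger groups make individual records harder to single out but move the released values further from the originals, which is the quality loss that a better grouping strategy aims to reduce.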