High Dimensional Data Processing in Privacy Preserving Data Mining
M. Rathi, A. Rajavat
2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT), published 2020-04-01
DOI: 10.1109/CSNT48778.2020.9115771 (https://doi.org/10.1109/CSNT48778.2020.9115771)
Citations: 2
Abstract
In business intelligence, data is essential to decision making. Incomplete or missing information can undermine an entire project. Different business entities therefore sometimes pool their sensitive and personal data to improve their decision making. As they do so, the dimensionality of the combined dataset grows significantly, which creates a pressing need for methods that can handle high-dimensional data. This paper makes two contributions to privacy-preserving data mining (PPDM). First, it surveys various PPDM models to understand how PPDM systems work and what they require. Second, it presents an experimental comparison of PCA, k-PCA, and correlation-coefficient-based feature selection as dimensionality reduction techniques. In these experiments, PCA and k-PCA degrade classification accuracy relative to correlation-coefficient-based feature selection. The correlation coefficient is therefore used to handle large numbers of data dimensions in the subsequent system design and implementation.
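To make the idea of correlation-coefficient-based feature selection concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which correlation is computed, so the sketch assumes the Pearson correlation between each feature and the class label and keeps the `k` features with the largest absolute value. The function names (`pearson`, `select_by_correlation`, `project`) and the toy data are hypothetical, and no classifier, PCA, or k-PCA baseline is included.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(rows, labels, k):
    """Indices of the k features with the largest |correlation| to the label.

    Assumption: features are ranked against the label; the paper may
    rank or filter them differently.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties go to the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


def project(rows, keep):
    """Keep only the selected feature columns of every row."""
    return [[row[j] for j in keep] for row in rows]


# Toy data, made up for illustration: feature 0 rises with the label,
# feature 1 is constant, feature 2 falls with the label, feature 3 is
# only weakly related to it.
rows = [
    [1.0, 5.0, 9.0, 2.0],
    [2.0, 5.0, 8.0, 7.0],
    [3.0, 5.0, 7.5, 1.0],
    [8.0, 5.0, 2.0, 6.0],
    [9.0, 5.0, 1.0, 3.0],
    [10.0, 5.0, 0.5, 4.0],
]
labels = [0, 0, 0, 1, 1, 1]

keep = select_by_correlation(rows, labels, k=2)
reduced = project(rows, keep)
print(keep)  # [0, 2]
```

Unlike PCA, which replaces the inputs with linear combinations of all features, this kind of selection keeps a subset of the original columns, so the reduced dataset remains directly interpretable in terms of the original attributes.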