Authors: R. Nayak, Debahuti Mishra, Satyabrata Das, Kailash Shaw, Sashikala Mishra, Ramamani Tripathy
Venue: International Conference on Advances in Computing, Communications and Informatics
Published: 2012-08-03
DOI: https://doi.org/10.1145/2345396.2345416
Clustering and classifying informative attributes using rough set theory
Clustering is an unsupervised data mining technique for exploring the natural structure of data and identifying interesting patterns in it, and it has proved helpful in finding coexpressed samples. In cluster analysis, a dataset is generally partitioned into groups based on the given features, such that data objects in the same group are more similar to each other than to data objects in other groups; that is, objects are grouped on the principle of maximizing intra-class similarity and minimizing inter-class similarity. In this paper, rough set theory (RST) is used for attribute clustering. RST is a theory for dealing with imprecise and uncertain knowledge; it analyzes the clusters and discovers regularities in the data when prior knowledge is not available, providing a new method for data classification. Although many rough set methods exist, data objects change continuously, so the relevant techniques must be improved over time and new theory proposed to meet the demands of applications. After applying the rough set based attribute clustering method to a real-life leukemia dataset, we classify the resulting data using traditional classification techniques: a Multilayered Perceptron (MLP) based classifier, a Naïve Bayesian (NB) classifier, and a Support Vector Machine (SVM). The same classification techniques are also applied to the original leukemia dataset, before rough set based attribute clustering. Finally, the paper provides a comparative analysis of the traditional classifiers and the corresponding proposed rough set based classifiers. Among all of them, the proposed MLP classifier is found to perform best, giving higher classification accuracy and proving efficient with a lower error ratio.
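The abstract does not spell out how attributes are scored or grouped, so the following is only a minimal sketch of one standard rough set notion, the Pawlak dependency degree, which measures how well a set of condition attributes determines the class label. It is not the authors' method. The toy table, the attribute names `g1` to `g3`, the class labels `A` and `B`, and the helper names `partition` and `dependency` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over `attrs`.

    Two rows fall in the same class when they agree on every attribute
    in `attrs`.
    """
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, dec_attr):
    """Pawlak dependency degree of `dec_attr` on `cond_attrs`.

    This is the fraction of rows lying in the positive region, i.e. in
    indiscernibility classes whose members all share one decision value.
    """
    positive = 0
    for block in partition(rows, cond_attrs):
        decisions = {rows[i][dec_attr] for i in block}
        if len(decisions) == 1:
            positive += len(block)
    return positive / len(rows)


# Hypothetical toy decision table: discretized values of three attributes
# (e.g. expression levels) plus a class label. Not data from the paper.
table = [
    {"g1": "high", "g2": "low",  "g3": "low",  "label": "A"},
    {"g1": "high", "g2": "low",  "g3": "high", "label": "A"},
    {"g1": "low",  "g2": "high", "g3": "low",  "label": "B"},
    {"g1": "low",  "g2": "high", "g3": "high", "label": "B"},
    {"g1": "high", "g2": "high", "g3": "low",  "label": "A"},
    {"g1": "low",  "g2": "low",  "g3": "high", "label": "B"},
]

if __name__ == "__main__":
    for attrs in (["g1"], ["g2"], ["g3"], ["g2", "g3"]):
        print(attrs, dependency(table, attrs, "label"))
```

In this toy table, `g1` alone fully determines the label, while `g2` and `g3` do so only for a minority of rows. A score of this kind could be used to rank or group attributes before handing the reduced data to a classifier such as an MLP, NB, or SVM, though the paper's actual clustering criterion may differ.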