Clustering and classifying informative attributes using rough set theory

International Conference on Advances in Computing, Communications and Informatics. Pub Date: 2012-08-03. DOI: 10.1145/2345396.2345416

R. Nayak, Debahuti Mishra, Satyabrata Das, Kailash Shaw, Sashikala Mishra, Ramamani Tripathy


Citations: 2

Abstract

Clustering is an unsupervised data mining technique used to explore the natural structure of data and to identify interesting patterns in it; it has also proved helpful in finding co-expressed samples. In cluster analysis, a dataset is generally partitioned into groups on the basis of its features so that objects in the same group are more similar to one another than to objects in other groups. Objects are grouped on the principle of maximizing intra-class similarity and minimizing inter-class similarity. In this paper, rough set theory (RST) is used for attribute clustering. RST is a theory for dealing with rough and uncertain knowledge: it analyzes clusters and finds regularities in the data when no prior knowledge is available, providing a new method for data classification. Although many rough set methods exist, data objects change continuously, so the relevant techniques must be improved over time and new theory proposed to meet the demands of applications. In this paper, after applying the rough set based attribute clustering method to a real-life leukemia dataset, we classify the result using traditional classification techniques: a Multilayer Perceptron (MLP) based classifier, a Naïve Bayesian (NB) classifier, and a Support Vector Machine (SVM). The same classification techniques are also applied to the original leukemia dataset, before rough set based attribute clustering. Finally, the paper provides a comparative analysis of the traditional classifiers and the corresponding proposed rough set based classifiers. Among all of them, the proposed MLP classifier is found to perform best, giving higher classification accuracy and a lower error ratio.
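The abstract does not spell out the attribute clustering procedure, so the following is only a minimal sketch of the standard rough set building blocks such a method would typically rest on: indiscernibility classes, lower and upper approximations, and the dependency degree of a decision attribute on a set of condition attributes. The toy table, the attribute names `g1`, `g2`, `label`, and the function names are all hypothetical and are not taken from the paper or its leukemia dataset.

```python
from collections import defaultdict
from fractions import Fraction


def partition(rows, attrs):
    """Group row indices into indiscernibility classes with respect to attrs."""
    blocks = defaultdict(set)
    for index, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(index)
    return list(blocks.values())


def lower_upper(rows, attrs, target):
    """Return the lower and upper approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


def dependency(rows, cond_attrs, decision_attr):
    """Dependency degree: size of the positive region divided by table size."""
    positive = set()
    for decision_block in partition(rows, [decision_attr]):
        lower, _ = lower_upper(rows, cond_attrs, decision_block)
        positive |= lower
    return Fraction(len(positive), len(rows))


# Hypothetical toy decision table (illustrative values only).
ROWS = [
    {"g1": "high", "g2": "low", "label": "c1"},
    {"g1": "high", "g2": "low", "label": "c2"},
    {"g1": "low", "g2": "low", "label": "c2"},
    {"g1": "low", "g2": "high", "label": "c1"},
    {"g1": "high", "g2": "high", "label": "c1"},
]

if __name__ == "__main__":
    c1 = {i for i, row in enumerate(ROWS) if row["label"] == "c1"}
    print(lower_upper(ROWS, ["g1", "g2"], c1))
    for attrs in (["g1", "g2"], ["g1"], ["g2"]):
        print(attrs, dependency(ROWS, attrs, "label"))
```

In this toy table, rows 0 and 1 share the same condition values but carry different labels, so they fall in the boundary region and the dependency degree stays below 1. Comparing the dependency degree across attribute subsets is one common way to judge how informative attributes are; whether the paper uses this exact measure is an assumption.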
