A rough-GA hybrid algorithm for rule extraction from large data
Authors: G. Chakraborty, B. Chakraborty
Venue: 2004 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications (CIMSA 2004)
Published: 2004-07-14
DOI: 10.1109/CIMSA.2004.1397237 (https://doi.org/10.1109/CIMSA.2004.1397237)
Knowledge discovery from large real-life data sets faces a variety of problems: noise and outliers in the data, selection of a proper subset of attributes (features) from a large number of relevant and irrelevant ones, fuzzification or discretization of real-valued data, and finally rule induction. In this proposal, rule creation has two steps. The first step is attribute selection, which is based on rough set theory. The second step searches for an optimal set of simple yet accurate rules, and is carried out by a genetic algorithm. The contribution here is how to set the fitness of chromosomes so that a tradeoff between simplicity and accuracy is achieved. Finally, chromosomes are coalesced to further simplify the rules and reduce their number.
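The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the fitness formula or the chromosome encoding, so the weighted-sum fitness, the dictionary rule encoding, the `weight` parameter, the function names, and the toy data are all assumptions made for illustration. The first function computes the standard rough-set dependency degree, which is one common way to score an attribute subset; the second scores a single candidate rule by combining accuracy with a penalty on the number of conditions.

```python
from collections import defaultdict


def dependency_degree(rows, attrs, decision):
    """Rough-set dependency degree of `decision` on `attrs`.

    Fraction of rows whose equivalence class (rows with identical values
    on `attrs`) carries exactly one decision value.
    """
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        decisions[key].add(row[decision])
        sizes[key] += 1
    consistent = sum(n for key, n in sizes.items() if len(decisions[key]) == 1)
    return consistent / len(rows)


def rule_fitness(rule, rows, decision, target, n_attrs, weight=0.8):
    """Hypothetical fitness: weighted sum of accuracy and simplicity.

    `rule` maps attribute -> required value (assumed encoding).
    Fewer conditions means a simpler rule and a higher simplicity term.
    """
    covered = [r for r in rows if all(r[a] == v for a, v in rule.items())]
    if not covered:
        return 0.0  # a rule that matches nothing is useless
    accuracy = sum(r[decision] == target for r in covered) / len(covered)
    simplicity = 1.0 - len(rule) / n_attrs
    return weight * accuracy + (1.0 - weight) * simplicity


# Toy decision table (made up for this sketch).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
]

# Step 1: score candidate attribute subsets.
gamma_windy = dependency_degree(rows, ["windy"], "play")
gamma_outlook = dependency_degree(rows, ["outlook"], "play")

# Step 2: score two equally accurate rules of different length.
short_rule = rule_fitness({"windy": "no"}, rows, "play", "yes", n_attrs=2)
long_rule = rule_fitness(
    {"windy": "no", "outlook": "sunny"}, rows, "play", "yes", n_attrs=2
)

print(gamma_windy, gamma_outlook, short_rule, long_rule)
```

On this toy table, `windy` alone determines the decision while `outlook` does not, and the one-condition rule scores higher than the two-condition rule of equal accuracy. Under the assumed weighting, that is the kind of preference a simplicity-accuracy tradeoff is meant to express.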