{"title":"Application of Gaussian Mixture Model Genetic Algorithm in data stream clustering analysis","authors":"M. Gao, Chan Tai-hua, Xiang-xiang Gao","doi":"10.1109/ICICISYS.2010.5658322","journal":{"name":"2010 IEEE International Conference on Intelligent Computing and Intelligent Systems","volume":"15 11","pages":"0"},"publicationDate":"2010-12-06","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"6","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICICISYS.2010.5658322"}
Application of Gaussian Mixture Model Genetic Algorithm in data stream clustering analysis
A data stream is unbounded and arrives at high speed, so traditional clustering algorithms cannot be applied to it directly. As an efficient tool for data analysis, the Gaussian mixture model (GMM) has been widely applied in signal and information processing, and it can approximate clusters of arbitrary shape. Cluster analysis faces two critical problems: selecting an appropriate number of clusters and partitioning overlapping clusters. Building on an extension of Gaussian mixture modeling, this paper proposes a new feature mining method named Gaussian Mixture Model with Genetic Algorithms. The method performs probability-density-based data stream clustering that requires only the newly arrived data, not the entire historical data, and it can also estimate the optimal number of clusters. The algorithm determines the number of Gaussian clusters and the parameters of each Gaussian through the random split and merge operations of the genetic algorithm. The resulting model gives accurate information about the characteristics that each attribute describes, which makes effective data stream mining possible.
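The abstract does not give the operators in detail, so the following is only a minimal sketch of how random split and merge moves over a one-dimensional Gaussian mixture might look. Everything in it is an assumption made for illustration: the names `Component`, `split`, `merge`, `bic`, and `evolve` are hypothetical, the moment-preserving split and merge formulas are one common choice and not necessarily the authors', the Bayesian information criterion (BIC) stands in for whatever fitness function the paper uses, and the search loop is a simple accept-if-better mutation loop, not a full genetic algorithm with a population and crossover. The list `xs` plays the role of the newly arrived chunk of the stream.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    weight: float  # mixing proportion
    mean: float
    var: float     # variance


def merge(a, b):
    """Merge two components, preserving total weight, mean and variance."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    var = (a.weight * (a.var + (a.mean - mean) ** 2)
           + b.weight * (b.var + (b.mean - mean) ** 2)) / w
    return Component(w, mean, var)


def split(c, u):
    """Split one component into two equal-weight halves.

    u in (0, 1) sets how far apart the halves are placed; the pair
    keeps the parent's weight, mean and variance (assumed scheme).
    """
    d = u * math.sqrt(c.var)
    child_var = c.var - d * d  # equals c.var * (1 - u**2) > 0
    half = c.weight / 2
    return (Component(half, c.mean - d, child_var),
            Component(half, c.mean + d, child_var))


def log_likelihood(mix, xs):
    total = 0.0
    for x in xs:
        p = sum(c.weight * math.exp(-(x - c.mean) ** 2 / (2 * c.var))
                / math.sqrt(2 * math.pi * c.var) for c in mix)
        total += math.log(max(p, 1e-300))  # guard against underflow to 0
    return total


def bic(mix, xs):
    """BIC score (lower is better); used here as a stand-in fitness."""
    free_params = 3 * len(mix) - 1
    return free_params * math.log(len(xs)) - 2 * log_likelihood(mix, xs)


def evolve(mix, xs, steps=200, seed=0):
    """Try random split/merge moves; keep a move only if BIC improves."""
    rng = random.Random(seed)
    best, best_score = list(mix), bic(mix, xs)
    for _ in range(steps):
        cand = list(best)
        if len(cand) > 1 and rng.random() < 0.5:
            i, j = rng.sample(range(len(cand)), 2)
            merged = merge(cand[i], cand[j])
            cand = [c for k, c in enumerate(cand) if k not in (i, j)]
            cand.append(merged)
        else:
            c = cand.pop(rng.randrange(len(cand)))
            cand.extend(split(c, rng.uniform(0.1, 0.95)))
        score = bic(cand, xs)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score
```

A possible use is to start from a single Gaussian fitted to the current chunk (sample mean and variance) and call `evolve(start, xs)`; the number of components in the returned mixture is then the estimated cluster count for that chunk. Because merge exactly undoes an equal-weight split, the two moves let the search both grow and shrink the model.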