{"title":"高维数据投影聚类的概率模型","authors":"Lifei Chen, Q. Jiang, Shengrui Wang","doi":"10.1109/ICDM.2008.15","DOIUrl":null,"url":null,"abstract":"Clustering high dimensional data is a big challenge in data mining due to the curse of dimensionality. To solve this problem, projective clustering has been defined as an extension of traditional clustering that seeks to find projected clusters in subsets of dimensions of a data space. In this paper, the problem of modeling projected clusters is first discussed, and an extended Gaussian model is proposed. Second, a general objective criterion used with k-means type projective clustering is presented based on the model. Finally, the expressions to learn model parameters are derived and then used in a new algorithm named FPC to perform fuzzy clustering on high dimensional data. The experimental results on document clustering show the effectiveness of the proposed clustering model.","PeriodicalId":252958,"journal":{"name":"2008 Eighth IEEE International Conference on Data Mining","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":"{\"title\":\"A Probability Model for Projective Clustering on High Dimensional Data\",\"authors\":\"Lifei Chen, Q. Jiang, Shengrui Wang\",\"doi\":\"10.1109/ICDM.2008.15\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Clustering high dimensional data is a big challenge in data mining due to the curse of dimensionality. To solve this problem, projective clustering has been defined as an extension of traditional clustering that seeks to find projected clusters in subsets of dimensions of a data space. In this paper, the problem of modeling projected clusters is first discussed, and an extended Gaussian model is proposed. Second, a general objective criterion used with k-means type projective clustering is presented based on the model. Finally, the expressions to learn model parameters are derived and then used in a new algorithm named FPC to perform fuzzy clustering on high dimensional data. The experimental results on document clustering show the effectiveness of the proposed clustering model.\",\"PeriodicalId\":252958,\"journal\":{\"name\":\"2008 Eighth IEEE International Conference on Data Mining\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"20\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 Eighth IEEE International Conference on Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2008.15\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Eighth IEEE International Conference on Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2008.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Probability Model for Projective Clustering on High Dimensional Data
Clustering high dimensional data is a big challenge in data mining due to the curse of dimensionality. To solve this problem, projective clustering has been defined as an extension of traditional clustering that seeks to find projected clusters in subsets of dimensions of a data space. In this paper, the problem of modeling projected clusters is first discussed, and an extended Gaussian model is proposed. Second, a general objective criterion used with k-means type projective clustering is presented based on the model. Finally, the expressions to learn model parameters are derived and then used in a new algorithm named FPC to perform fuzzy clustering on high dimensional data. The experimental results on document clustering show the effectiveness of the proposed clustering model.