Clustering Analysis Using Data Range Aware Seeding and Agglomerative Expectation Maximization
Authors: Hongwei Zhu, Honglei Zhu
Published in: 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System, 2007-12-16
DOI: 10.1109/SITIS.2007.61
Expectation maximization (EM) finds a local maximum of the mixture-model likelihood. When applied to clustering analysis, it produces good results only when given a reasonably good initialization, which hierarchical agglomeration can supply. However, hierarchical agglomeration scales poorly because of its computational complexity. This paper presents a novel method, called ISOEM, to overcome this limitation. It uses a data-range-aware seeding algorithm to create an initial classification, which in turn initializes an iterative self-organizing process. The process alternates between EM and agglomeration coupled with classification EM. Evaluation on two imagery datasets showed that the method performed very well. The paper also presents the results of using a skewness measure and a separation-cohesion index as indicators for determining the number of clusters in the data.
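The abstract does not spell out how ISOEM seeds or agglomerates, so the following is only a minimal sketch of the general idea of seeding a classification EM run from the data range, not the authors' algorithm. It assumes one-dimensional data and Gaussian components. The equal-width binning in `range_seed` is a hypothetical stand-in for the paper's data range aware seeding, and all function names are illustrative.

```python
import math


def range_seed(xs, k):
    """Hypothetical seeding (an assumption, not the paper's method):
    put each point into one of k equal-width bins spanning [min, max]."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 if all points are equal
    return [min(int((x - lo) / width), k - 1) for x in xs]


def m_step(xs, labels, k):
    """Estimate (weight, mean, variance) for every non-empty class
    of a hard partition; empty classes are dropped."""
    params = []
    for j in range(k):
        members = [x for x, c in zip(xs, labels) if c == j]
        if not members:
            continue
        mean = sum(members) / len(members)
        var = sum((x - mean) ** 2 for x in members) / len(members)
        params.append((len(members) / len(xs), mean, max(var, 1e-6)))
    return params


def log_density(x, weight, mean, var):
    """Log of weight * N(x | mean, var)."""
    return (math.log(weight)
            - 0.5 * math.log(2 * math.pi * var)
            - (x - mean) ** 2 / (2 * var))


def cem(xs, k, max_iter=100):
    """Classification EM: alternate hard assignment and re-estimation
    until the partition stops changing."""
    labels = range_seed(xs, k)
    params = m_step(xs, labels, k)
    for _ in range(max_iter):
        # E-step and C-step combined: assign each point to the component
        # with the highest weighted density.
        new_labels = [
            max(range(len(params)), key=lambda j: log_density(x, *params[j]))
            for x in xs
        ]
        if new_labels == labels:
            break
        labels = new_labels
        params = m_step(xs, labels, len(params))
    return labels, params
```

Calling `cem(xs, k)` returns the final hard labels and one `(weight, mean, variance)` tuple per surviving component. The agglomeration step that the abstract describes, which would merge components between EM passes, is deliberately left out because the source gives no merge criterion.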