Investigation of the optimal number of clusters by the adaptive EM algorithm
Author: V. Novoselac
Journal: Croatian Operational Research Review
Published: 2019-07-04
DOI: 10.17535/CRORR.2019.0001 (https://doi.org/10.17535/CRORR.2019.0001)
Impact factor: 0.5 (JCR Q4, Economics)
Citations: 0
Abstract
This paper investigates the optimal number of clusters for datasets modeled as a Gaussian mixture. For that purpose, an adaptive method based on a modified Expectation Maximization (EM) algorithm is developed. The modification is made within the hidden variable of the standard EM algorithm. Assuming that the data are multivariate normally distributed, with each component of the Gaussian mixture corresponding to one cluster, the modification uses the fact that the squared Mahalanobis distance of the samples follows a chi-square distribution. In addition, a quantitative measure is constructed to determine the number of clusters. The proposed method is illustrated with several numerical examples.
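The distributional fact the abstract relies on can be checked numerically. The sketch below is not the paper's algorithm; it only illustrates that, for samples drawn from a single Gaussian component with known mean and covariance, the squared Mahalanobis distance behaves like a chi-square variable with as many degrees of freedom as the data dimension. The function name `squared_mahalanobis`, the example mean and covariance, and the 95% level are assumptions chosen for illustration. The dimension is fixed at 2 because the chi-square quantile then has a closed form and no extra library is needed.

```python
import numpy as np


def squared_mahalanobis(X, mean, cov):
    """Squared Mahalanobis distance of each row of X to (mean, cov)."""
    diff = X - mean
    # Solve cov @ z = diff.T rather than forming the explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    return np.einsum("ij,ij->i", diff, solved)


# Hypothetical single Gaussian component in d = 2 dimensions.
rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.6], [0.6, 1.0]])
X = rng.multivariate_normal(mean, cov, size=20000)

d2 = squared_mahalanobis(X, mean, cov)

# For d = 2 the chi-square CDF is 1 - exp(-x / 2),
# so the 95% quantile is -2 * ln(0.05).
threshold_95 = -2.0 * np.log(0.05)
inside_fraction = np.mean(d2 <= threshold_95)

print(f"mean of squared distances: {d2.mean():.3f} (chi-square mean is 2)")
print(f"fraction inside the 95% quantile: {inside_fraction:.3f}")
```

A threshold of this kind is one plausible way to decide whether a sample is consistent with a given mixture component; how the paper actually incorporates it into the hidden variable of EM is not specified in the abstract.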
About the journal:
Croatian Operational Research Review (CRORR) is a journal that publishes original scientific papers in the area of operational research (OR). Its purpose is to publish papers on various aspects of OR, with the aim of presenting scientific ideas that contribute to both the theoretical development and the practical application of OR. The scope of the journal covers the following subject areas: linear and non-linear programming, integer programming, combinatorial and discrete optimization, multi-objective programming, stochastic models and optimization, scheduling, macroeconomics, economic theory, game theory, statistics and econometrics, marketing and data analysis, information and decision support systems, banking, finance, insurance, environment, energy, health, neural networks and fuzzy systems, control theory, simulation, practical OR and applications. The audience includes both researchers and practitioners from operations research, applied mathematics, statistics, econometrics, intelligent methods, simulation, and the other areas listed above. The journal has an international editorial board of more than 30 editors, all university professors, from Croatia, Slovenia, USA, Italy, Germany, Austria and other countries.