Balanced longitudinal data clustering with a copula kernel mixture model
Xi Zhang, Orla A. Murphy, Paul D. McNicholas
Canadian Journal of Statistics-Revue Canadienne De Statistique, vol. 53, no. 1. Published 2025-01-31.
DOI: 10.1002/cjs.11838
https://onlinelibrary.wiley.com/doi/10.1002/cjs.11838
Citations: 0
Abstract
Many common clustering methods cannot be used for clustering balanced multivariate longitudinal data in cases where the covariance of variables is a function of the time points. In this article, a copula kernel mixture model (CKMM) is proposed for clustering data of this type. The CKMM is a finite mixture model that decomposes each mixture component's joint density function into a copula and marginal distribution functions. In this decomposition, the Gaussian copula is used due to its mathematical tractability, and Gaussian kernel functions are used to estimate the marginal distributions. A generalized expectation-maximization algorithm is used to estimate the model parameters. The performance of the proposed model is assessed in a simulation study and on two real datasets. The proposed model is shown to have effective performance in comparison with standard methods, such as K-means with dynamic time warping clustering, latent growth models and functional high-dimensional data clustering.
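The decomposition described in the abstract can be illustrated with a minimal sketch for a single component and only two variables. This is not the authors' implementation: the function names, the fixed bandwidths `h1` and `h2`, and the restriction to a bivariate Gaussian copula with one correlation `rho` are all assumptions made for illustration, and the mixture weights and the generalized expectation-maximization step are left out. The joint density is formed as the copula density, evaluated at the kernel-estimated marginal distribution functions, multiplied by the kernel-estimated marginal densities.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides pdf, cdf and inv_cdf


def kde_pdf(x, sample, h):
    """Gaussian kernel density estimate at x, with an assumed fixed bandwidth h."""
    return sum(STD_NORMAL.pdf((x - xi) / h) for xi in sample) / (len(sample) * h)


def kde_cdf(x, sample, h):
    """Distribution function implied by the same Gaussian kernel estimate."""
    return sum(STD_NORMAL.cdf((x - xi) / h) for xi in sample) / len(sample)


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density with correlation rho, for u, v in (0, 1)."""
    a, b = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    one_minus = 1.0 - rho * rho
    expo = -(rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * one_minus)
    return math.exp(expo) / math.sqrt(one_minus)


def component_density(x1, x2, sample1, sample2, h1, h2, rho):
    """Joint density of one hypothetical component: copula term times marginals.

    Far from the data, kde_cdf can round to 0 or 1, where inv_cdf raises an
    error; this sketch does not guard against that case.
    """
    u = kde_cdf(x1, sample1, h1)
    v = kde_cdf(x2, sample2, h2)
    copula_term = gaussian_copula_density(u, v, rho)
    return copula_term * kde_pdf(x1, sample1, h1) * kde_pdf(x2, sample2, h2)
```

With `rho = 0` the copula density equals 1 everywhere, so the joint density reduces to the product of the two kernel marginals; a nonzero `rho` is what carries the dependence between variables.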
About the journal:
The Canadian Journal of Statistics is the official journal of the Statistical Society of Canada. It has an international reputation as an excellent journal. The editorial board comprises statistical scientists with applied, computational, methodological, theoretical and probabilistic interests. Their role is to ensure that the journal continues to provide an international forum for the discipline of Statistics.
The journal seeks papers making broad points of interest to many readers, whereas papers making important points of more specific interest are better placed in more specialized journals. The levels of innovation and impact are key in the evaluation of submitted manuscripts.