Conditional mixture modelling for heavy-tailed and skewed data
Aqi Dong, Volodymyr Melnykov, Yang Wang, Xuwen Zhu
Stat, published 2023-08-30. DOI: 10.1002/sta4.608 (https://doi.org/10.1002/sta4.608)
Citations: 0
Abstract
Overparameterization is a serious concern for multivariate mixture models, as it can lead to model overfitting and, as a result, mixture order underestimation. Parsimonious modelling is one of the most effective remedies in this context. In Gaussian mixture models, the majority of parameters are associated with covariance matrices, and parsimonious models based on factor analysers and spectral decomposition of dispersion parameters are the most popular in the literature. Drawbacks of these models include a lack of flexibility in imposing different covariance structures on individual components and limitations in modelling compact clusters. Recently introduced conditional mixture models provide substantial flexibility in addressing these concerns. The components of such mixtures are formulated as products of conditional distributions, with univariate Gaussian densities being the primary choice. However, the presence of heavy tails or skewness in any dimension can lead to fitting problems. We propose a flexible model that is free of the above-mentioned limitations, which we call the contaminated transformation conditional mixture model, and demonstrate through a series of simulation studies that it can effectively account for skewness and heavy tails. Applications to real-life data sets show good results and highlight the promise of the proposed model.
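The idea of writing a mixture component as a product of univariate Gaussian conditionals can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that each coordinate has a Gaussian conditional whose mean is linear in the preceding coordinates, and it leaves out the contamination and transformation parts of the proposed model entirely. All function and parameter names (`conditional_component_logpdf`, `intercepts`, `slopes`, `variances`) are hypothetical.

```python
import math


def normal_logpdf(x, mean, var):
    """Log-density of a univariate Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def conditional_component_logpdf(x, intercepts, slopes, variances):
    """Log-density of one component written as a product of conditionals.

    Assumption for illustration: coordinate j is Gaussian with mean
    intercepts[j] + sum_k slopes[j][k] * x[k] over preceding coordinates
    k < j, and with variance variances[j].
    """
    total = 0.0
    for j, xj in enumerate(x):
        mean = intercepts[j] + sum(b * xk for b, xk in zip(slopes[j], x[:j]))
        total += normal_logpdf(xj, mean, variances[j])
    return total


def mixture_logpdf(x, weights, components):
    """Log-density of a finite mixture of such components.

    components is a list of (intercepts, slopes, variances) tuples.
    """
    logs = [
        math.log(w) + conditional_component_logpdf(x, *params)
        for w, params in zip(weights, components)
    ]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in logs))


# Usage: two bivariate components with different dependence structures.
comp_a = ([0.0, 0.0], [[], [0.5]], [1.0, 0.75])   # x2 depends on x1
comp_b = ([3.0, 3.0], [[], [0.0]], [0.5, 0.5])    # x2 independent of x1
value = mixture_logpdf([0.3, -0.2], [0.6, 0.4], [comp_a, comp_b])
```

Because each component carries its own set of slopes, individual components can have different dependence structures; setting a slope to zero drops the corresponding parameter, which is one way such a formulation can be made parsimonious.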
Stat (Decision Sciences: Statistics, Probability and Uncertainty)
CiteScore: 1.10
Self-citation rate: 0.00%
Articles published: 85
About the journal:
Stat is an innovative electronic journal for the rapid publication of novel and topical research results, publishing compact articles of the highest quality in all areas of statistical endeavour. Its purpose is to provide a means of rapid sharing of important new theoretical, methodological and applied research. Stat is a joint venture between the International Statistical Institute and Wiley-Blackwell.
Stat is characterised by:
• Speed - a high-quality review process that aims to reach a decision within 20 days of submission.
• Concision - a maximum article length of 10 pages of text, not including references.
• Supporting materials - inclusion of electronic supporting materials including graphs, video, software, data and images.
• Scope - addresses all areas of statistics and interdisciplinary areas.
Stat is a scientific journal for the international community of statisticians and researchers and practitioners in allied quantitative disciplines.