Salvatore D. Tomarchio, Luca Bagnato, Antonio Punzo
Title: Model-based clustering via new parsimonious mixtures of heavy-tailed distributions
DOI: 10.1007/s10182-021-00430-8
Published: 2022-01-14 (Journal Article)
Journal (as listed in the record): Accounts of Chemical Research
Link: https://link.springer.com/article/10.1007/s10182-021-00430-8
Citations: 6
Abstract
Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions, the shifted exponential normal and the tail-inflated normal, recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as a by-product. A further simulated analysis is conducted to assess how sensitive our models and some well-established parsimonious competitors are to their own generative scheme. Lastly, our models and the competing ones are evaluated in terms of fitting and clustering on three real datasets.
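The abstract does not spell out the eigen-decomposition it refers to. As a minimal sketch, assuming the form commonly used for parsimonious mixtures, Sigma = lambda * Gamma * Delta * Gamma' (volume lambda, orientation Gamma, shape Delta with determinant 1), the Python below splits one scale matrix into those three parts and rebuilds it. The function names `decompose_scale` and `compose_scale` and the example matrix are hypothetical and are not taken from the paper.

```python
import numpy as np


def decompose_scale(sigma):
    """Split a symmetric positive-definite matrix into (volume, orientation, shape).

    Assumed parameterization (not confirmed by the abstract):
    sigma = lam * gamma @ delta @ gamma.T, with det(delta) == 1.
    """
    sigma = np.asarray(sigma, dtype=float)
    d = sigma.shape[0]
    eigvals, gamma = np.linalg.eigh(sigma)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, gamma = eigvals[order], gamma[:, order]
    lam = float(np.prod(eigvals) ** (1.0 / d))    # volume: det(sigma) ** (1/d)
    delta = np.diag(eigvals / lam)                # shape: normalized eigenvalues
    return lam, gamma, delta


def compose_scale(lam, gamma, delta):
    """Rebuild the scale matrix from its three parts."""
    return lam * gamma @ delta @ gamma.T


# Hypothetical 2x2 scale matrix, for illustration only.
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, gamma, delta = decompose_scale(sigma)
rebuilt = compose_scale(lam, gamma, delta)
```

Under this parameterization, a parsimonious family would arise by forcing some of the three parts to be equal across mixture components, for example a common shape `delta` with component-specific `lam` and `gamma`, which lowers the number of free parameters.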