C. Tortora, R. Browne, Aisha Elsherbiny, B. Franczak, P. McNicholas
Model-Based Clustering, Classification, and Discriminant Analysis Using the Generalized Hyperbolic Distribution: MixGHD R package
Journal of Statistical Software, published 2021-01-01. DOI: 10.18637/jss.v098.i03 (https://doi.org/10.18637/jss.v098.i03)
Citations: 11
Abstract
The MixGHD package for R performs model-based clustering, classification, and discriminant analysis using the generalized hyperbolic distribution (GHD). This approach is suitable for data that can be considered a realization of a (multivariate) continuous random variable. The GHD is flexible owing to its skewness, concentration, and index parameters; as such, clustering methods that use this distribution can estimate clusters of different shapes. The package provides five different models, all based on the GHD, an efficient routine for discriminant analysis, and a function to measure cluster agreement. This paper is split into three parts: the first is devoted to the formulation of each method and its extension to classification and discriminant analysis, the second focuses on the algorithms, and the third shows the use of the package on real datasets.
About the journal:
The Journal of Statistical Software (JSS) publishes open-source software and corresponding reproducible articles discussing all aspects of the design, implementation, documentation, application, evaluation, comparison, maintenance, and distribution of software dedicated to improving the state of the art in statistical computing in all areas of empirical research. Open-source code and articles are jointly reviewed and published in this journal and should be accessible to a broad community of practitioners, teachers, and researchers in the field of statistics.