{"title":"Similarity-Reduced Diversities: the Effective Entropy and the Reduced Entropy.","authors":"François Bavaud","doi":"10.1007/s00357-021-09395-4","DOIUrl":"https://doi.org/10.1007/s00357-021-09395-4","url":null,"abstract":"<p>The paper presents and analyzes the properties of a new diversity index, the effective entropy, which lowers Shannon entropy by taking into account the presence of similarities between items. Similarities decrease exponentially with the item dissimilarities, with a freely adjustable discriminability parameter controlling various diversity regimes separated by phase transitions. Effective entropies are determined iteratively, and turn out to be concave and subadditive, in contrast to the reduced entropy, proposed in Ecology for similar purposes. Two data sets are used to illustrate the formalism, and underline the role played by the dissimilarity types.</p>","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924145/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40305236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Erratum to: The Spatial Representation of Consumer Dispersion Patterns via a New Multi-level Latent Class Methodology","authors":"Sunghoon Kim, Ashley Stadler Blank, W. DeSarbo, J. Vermunt","doi":"10.1007/s00357-021-09405-5","DOIUrl":"https://doi.org/10.1007/s00357-021-09405-5","url":null,"abstract":"","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45578789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chimeral Clustering","authors":"Jason Hou-Liu, R. Browne","doi":"10.1007/s00357-021-09396-3","DOIUrl":"https://doi.org/10.1007/s00357-021-09396-3","url":null,"abstract":"","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"51951605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MatTransMix: an R Package for Matrix Model-Based Clustering and Parsimonious Mixture Modeling","authors":"Zhu, Xuwen, Sarkar, Shuchismita, Melnykov, Volodymyr","doi":"10.1007/s00357-021-09401-9","DOIUrl":"https://doi.org/10.1007/s00357-021-09401-9","url":null,"abstract":"<p>Finite mixture modeling, expanded to matrix-valued data, faces several challenges. One of the major concerns is overparameterization resulting from the high number of parameters involved in a matrix mixture. In addition, an appropriate power transformation is very useful if the data are skewed. The R package MatTransMix is a new piece of software devoted to parsimonious models, based on spectral decomposition of covariance matrices, developed for fitting heterogeneous matrix-valued data providing model-based clustering results. The package implements a variety of parsimonious models obtained from various combinations of spectral decomposition and skewness parameters. The paper discusses some methodological foundations of the proposed models and elaborates the functions available in this package on carefully chosen examples.</p>","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138536000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Erratum to: On Finite Mixture Modeling of Change-Point Processes","authors":"Xuwen Zhu, Yana Melnykov","doi":"10.1007/s00357-021-09400-w","DOIUrl":"https://doi.org/10.1007/s00357-021-09400-w","url":null,"abstract":"","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"51951640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interaction Identification and Clique Screening for Classification with Ultra-high Dimensional Discrete Features","authors":"An, Baiguo, Feng, Guozhong, Guo, Jianhua","doi":"10.1007/s00357-021-09399-0","DOIUrl":"https://doi.org/10.1007/s00357-021-09399-0","url":null,"abstract":"<p>Interactions have greatly influenced recent scientific discoveries, but the identification of interactions is challenging in ultra-high dimensions. In this study, we propose an interaction identification method for classification with ultra-high dimensional discrete features. We utilize clique sets to capture interactions among features, where features in a common clique have interactions that can be used for classification. The number of features related to the interaction is the size of the clique. Hence, our method can consider interactions caused by more than two feature variables. We propose a Kullback-Leibler divergence-based approach to correctly identify the clique sets with a probability that tends to 1 as the sample size tends to infinity. A clique screening method is then proposed to filter out clique sets that are useless for classification, and the strong sure screening property can be guaranteed. Finally, a clique naïve Bayes classifier is proposed for classification. Numerical studies demonstrate that our proposed approach performs very well.</p>","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138536010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Spatial Representation of Consumer Dispersion Patterns via a New Multi-level Latent Class Methodology","authors":"Sunghoon Kim, Ashley Stadler Blank, W. DeSarbo, J. Vermunt","doi":"10.1007/s00357-021-09398-1","DOIUrl":"https://doi.org/10.1007/s00357-021-09398-1","url":null,"abstract":"","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42586415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing Boosting and Bagging for Decision Trees of Rankings","authors":"Plaia, Antonella, Buscemi, Simona, Fürnkranz, Johannes, Mencía, Eneldo Loza","doi":"10.1007/s00357-021-09397-2","DOIUrl":"https://doi.org/10.1007/s00357-021-09397-2","url":null,"abstract":"<p>Decision tree learning is among the most popular and most traditional families of machine learning algorithms. While these techniques excel in being quite intuitive and interpretable, they also suffer from instability: small perturbations in the training data may result in big changes in the predictions. The so-called ensemble methods combine the output of multiple trees, which makes the decision more reliable and stable. They have been primarily applied to numeric prediction problems and to classification tasks. In the last years, some attempts to extend the ensemble methods to ordinal data can be found in the literature, but no concrete methodology has been provided for preference data. In this paper, we extend decision trees, and in the following also ensemble methods to ranking data. In particular, we propose a theoretical and computational definition of bagging and boosting, two of the best known ensemble methods. In an experimental study using simulated data and real-world datasets, our results confirm that known results from classification, such as that boosting outperforms bagging, could be successfully carried over to the ranking case.</p>","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138536001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Partition of Interval-Valued Observations Using Regression","authors":"Fei Liu, L. Billard","doi":"10.1007/s00357-021-09394-5","DOIUrl":"https://doi.org/10.1007/s00357-021-09394-5","url":null,"abstract":"","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s00357-021-09394-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"51951585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Gibbs Sampling Algorithm with Monotonicity Constraints for Diagnostic Classification Models","authors":"K. Yamaguchi, J. Templin","doi":"10.1007/s00357-021-09392-7","DOIUrl":"https://doi.org/10.1007/s00357-021-09392-7","url":null,"abstract":"","PeriodicalId":50241,"journal":{"name":"Journal of Classification","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2021-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s00357-021-09392-7","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44749155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}