Infinite Langevin Mixture Modeling and Feature Selection
Ola Amayri, N. Bouguila
2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2016-10-01
DOI: 10.1109/DSAA.2016.22
Citations: 4
Abstract
In this paper, we introduce data clustering based on infinite mixture models for spherical patterns. This clustering is based on the Langevin distribution, which has been shown to model this kind of data effectively. The proposed learning algorithm follows a fully Bayesian approach. In contrast to classical Bayesian approaches, which assume an unknown but finite number of mixture components, the proposed approach assumes an infinite number of components; such models have seen considerable theoretical and computational advances in recent years. In particular, we develop a Markov Chain Monte Carlo (MCMC) algorithm to sample from the posterior distributions associated with the priors selected for the different model parameters. Moreover, we propose an infinite framework that allows simultaneous feature selection and parameter estimation. The usefulness of the developed framework is demonstrated through a topic novelty detection application.
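To make the idea concrete, below is a minimal, illustrative sketch of MCMC inference for an infinite mixture of Langevin (von Mises) components on the circle, using a Chinese-restaurant-process Gibbs sampler in the style of Neal's Algorithm 8 with one auxiliary component. This is an assumption-laden toy, not the authors' algorithm: it fixes the concentration `kappa`, uses a uniform prior on component means, and omits the feature selection part of the paper entirely.

```python
import numpy as np

def vonmises_pdf(x, mu, kappa):
    # Langevin (von Mises) density on the circle; np.i0 is the
    # modified Bessel function of the first kind, order 0.
    return np.exp(kappa * np.cos(x - mu)) / (2.0 * np.pi * np.i0(kappa))

def crp_vonmises_gibbs(theta, alpha=1.0, kappa=4.0, n_iter=100, seed=0):
    """Toy infinite-mixture sampler (NOT the paper's algorithm):
    CRP prior over assignments, fixed kappa, uniform prior on means."""
    rng = np.random.default_rng(seed)
    n = len(theta)
    z = np.zeros(n, dtype=int)              # cluster assignments
    mus = {0: rng.uniform(0, 2 * np.pi)}    # component mean directions
    counts = {0: n}
    for _ in range(n_iter):
        for i in range(n):
            # Remove point i from its cluster; drop the cluster if empty.
            k = z[i]
            counts[k] -= 1
            if counts[k] == 0:
                del counts[k]
                del mus[k]
            # One auxiliary (brand-new) component with mean from the prior.
            k_new = max(mus, default=-1) + 1
            mu_new = rng.uniform(0, 2 * np.pi)
            labels = list(mus) + [k_new]
            weights = [counts[k2] * vonmises_pdf(theta[i], mus[k2], kappa)
                       for k2 in labels[:-1]]
            weights.append(alpha * vonmises_pdf(theta[i], mu_new, kappa))
            weights = np.asarray(weights)
            k_pick = labels[rng.choice(len(labels), p=weights / weights.sum())]
            if k_pick == k_new:             # a new component was born
                mus[k_new] = mu_new
                counts[k_new] = 0
            z[i] = k_pick
            counts[k_pick] = counts.get(k_pick, 0) + 1
        # Resample each mean: with a uniform prior the posterior is
        # von Mises centred at the circular mean, concentration kappa*R.
        for k2 in list(counts):
            members = theta[z == k2]
            S, C = np.sin(members).sum(), np.cos(members).sum()
            R = np.hypot(S, C)
            mus[k2] = rng.vonmises(np.arctan2(S, C), kappa * R) % (2 * np.pi)
    return z, mus
```

In a full Dirichlet-process treatment, `alpha` and `kappa` would themselves receive priors and be resampled inside the MCMC loop, which is part of what the fully Bayesian formulation in the paper addresses.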