Authors: Ola Amayri, N. Bouguila
Venue: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Published: 2016-10-01
DOI: 10.1109/DSAA.2016.22
Infinite Langevin Mixture Modeling and Feature Selection
In this paper, we introduce data clustering based on infinite mixture models for spherical patterns. This clustering approach is based on the Langevin distribution, which has been shown to model this kind of data effectively. Learning is tackled using a fully Bayesian approach. In contrast to classical Bayesian approaches, which assume an unknown but finite number of mixture components, the proposed approach assumes an infinite number of components; such models have witnessed considerable theoretical and computational advances in recent years. In particular, we develop a Markov chain Monte Carlo (MCMC) algorithm to sample from the posterior distributions associated with the priors selected for the different model parameters. Moreover, we propose an infinite framework that allows simultaneous feature selection and parameter estimation. The usefulness of the developed framework is demonstrated via a topic novelty detection application.
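The abstract gives no implementation details, so the following is only a rough illustrative sketch (not the authors' algorithm) of the general idea: clustering unit vectors with an infinite (Chinese-restaurant-process) mixture of Langevin/von Mises-Fisher components via Gibbs sampling. All specifics here are assumptions: the concentration `kappa` is treated as fixed and shared, cluster mean directions are updated to their posterior mode (the normalized resultant vector), and a new cluster's marginal likelihood under a uniform prior on the mean direction is the uniform density 1/(4*pi). The paper's feature-selection mechanism and full posterior sampling of all parameters are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def vmf_logpdf(x, mu, kappa):
    """Log-density of the von Mises-Fisher (Langevin) distribution on the unit 2-sphere:
    f(x) = kappa / (4*pi*sinh(kappa)) * exp(kappa * mu.x)."""
    return np.log(kappa) - np.log(4.0 * np.pi * np.sinh(kappa)) + kappa * float(mu @ x)

def crp_langevin_gibbs(X, kappa=50.0, alpha=1.0, n_sweeps=20, rng=rng):
    """Simplified CRP Gibbs sampler for an infinite Langevin mixture (illustrative only).
    Mean directions are set to their posterior mode after each sweep rather than sampled."""
    n = len(X)
    z = np.zeros(n, dtype=int)                       # start with a single cluster
    resultant = X.sum(axis=0)
    mus = [resultant / np.linalg.norm(resultant)]
    for _ in range(n_sweeps):
        for i in range(n):
            counts = np.bincount(np.delete(z, i), minlength=len(mus))
            log_p = []
            for k, mu in enumerate(mus):
                if counts[k] == 0:                   # emptied mid-sweep: cannot rejoin
                    log_p.append(-np.inf)
                else:
                    log_p.append(np.log(counts[k]) + vmf_logpdf(X[i], mu, kappa))
            # new-table weight: alpha times the uniform marginal 1/(4*pi)
            log_p.append(np.log(alpha) - np.log(4.0 * np.pi))
            log_p = np.array(log_p)
            p = np.exp(log_p - log_p.max())
            p /= p.sum()
            k_new = rng.choice(len(p), p=p)
            if k_new == len(mus):                    # open a new cluster at this point
                mus.append(X[i].copy())
            z[i] = k_new
        # posterior-mode update of mean directions; drop empty clusters
        keep = [k for k in range(len(mus)) if np.any(z == k)]
        for k in keep:
            r = X[z == k].sum(axis=0)
            mus[k] = r / np.linalg.norm(r)
        remap = {k: j for j, k in enumerate(keep)}
        z = np.array([remap[k] for k in z])
        mus = [mus[k] for k in keep]
    return z, mus

# toy data: two tight bundles of unit vectors around orthogonal directions
a = rng.normal([0.0, 0.0, 5.0], 0.3, size=(20, 3))
b = rng.normal([5.0, 0.0, 0.0], 0.3, size=(20, 3))
X = np.vstack([a, b])
X /= np.linalg.norm(X, axis=1, keepdims=True)
z, mus = crp_langevin_gibbs(X)
```

With well-separated directions and a large `kappa`, the sampler typically recovers two clusters without the number of components ever being fixed in advance, which is the practical appeal of the infinite-mixture formulation.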