受污染的吉布斯型先验

IF 4.9 2区 数学 Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS
F. Camerlenghi, R. Corradin, A. Ongaro
{"title":"受污染的吉布斯型先验","authors":"F. Camerlenghi, R. Corradin, A. Ongaro","doi":"10.1214/22-ba1358","DOIUrl":null,"url":null,"abstract":"Gibbs-type priors are widely used as key components in several Bayesian nonparametric models. By virtue of their flexibility and mathematical tractability, they turn out to be predominant priors in species sampling problems, clustering and mixture modelling. We introduce a new family of processes which extend the Gibbs-type one, by including a contaminant component in the model to account for the presence of anomalies (outliers) or an excess of observations with frequency one. We first investigate the induced random partition, the associated predictive distribution and we characterize the asymptotic behaviour of the number of clusters. All the results we obtain are in closed form and easily interpretable, as a noteworthy example we focus on the contaminated version of the Pitman-Yor process. Finally we pinpoint the advantage of our construction in different applied problems: we show how the contaminant component helps to perform outlier detection for an astronomical clustering problem and to improve predictive inference in a speciesrelated dataset, exhibiting a high number of species with frequency one.","PeriodicalId":55398,"journal":{"name":"Bayesian Analysis","volume":null,"pages":null},"PeriodicalIF":4.9000,"publicationDate":"2021-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Contaminated Gibbs-Type Priors\",\"authors\":\"F. Camerlenghi, R. Corradin, A. Ongaro\",\"doi\":\"10.1214/22-ba1358\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Gibbs-type priors are widely used as key components in several Bayesian nonparametric models. By virtue of their flexibility and mathematical tractability, they turn out to be predominant priors in species sampling problems, clustering and mixture modelling. We introduce a new family of processes which extend the Gibbs-type one, by including a contaminant component in the model to account for the presence of anomalies (outliers) or an excess of observations with frequency one. We first investigate the induced random partition, the associated predictive distribution and we characterize the asymptotic behaviour of the number of clusters. All the results we obtain are in closed form and easily interpretable, as a noteworthy example we focus on the contaminated version of the Pitman-Yor process. Finally we pinpoint the advantage of our construction in different applied problems: we show how the contaminant component helps to perform outlier detection for an astronomical clustering problem and to improve predictive inference in a speciesrelated dataset, exhibiting a high number of species with frequency one.\",\"PeriodicalId\":55398,\"journal\":{\"name\":\"Bayesian Analysis\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.9000,\"publicationDate\":\"2021-08-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Bayesian Analysis\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1214/22-ba1358\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bayesian Analysis","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1214/22-ba1358","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

摘要

吉布斯先验被广泛地用作贝叶斯非参数模型的关键成分。由于它们的灵活性和数学可追溯性,它们在物种抽样问题、聚类和混合建模中成为优势先验。我们引入了一系列新的过程,通过在模型中包含污染物成分来解释异常(异常值)或频率为1的过量观测值的存在,从而扩展了吉布斯型过程。我们首先研究了诱导随机划分,相关的预测分布,并刻画了簇数的渐近行为。我们得到的所有结果都是封闭的,易于解释,作为一个值得注意的例子,我们关注的是Pitman-Yor过程的污染版本。最后,我们指出了我们的构建在不同应用问题中的优势:我们展示了污染物成分如何帮助执行天文聚类问题的离群值检测,并改善物种相关数据集中的预测推断,展示了频率为1的大量物种。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Contaminated Gibbs-Type Priors
Gibbs-type priors are widely used as key components in several Bayesian nonparametric models. By virtue of their flexibility and mathematical tractability, they turn out to be predominant priors in species sampling problems, clustering and mixture modelling. We introduce a new family of processes which extend the Gibbs-type one, by including a contaminant component in the model to account for the presence of anomalies (outliers) or an excess of observations with frequency one. We first investigate the induced random partition, the associated predictive distribution and we characterize the asymptotic behaviour of the number of clusters. All the results we obtain are in closed form and easily interpretable, as a noteworthy example we focus on the contaminated version of the Pitman-Yor process. Finally we pinpoint the advantage of our construction in different applied problems: we show how the contaminant component helps to perform outlier detection for an astronomical clustering problem and to improve predictive inference in a speciesrelated dataset, exhibiting a high number of species with frequency one.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Bayesian Analysis
Bayesian Analysis 数学-数学跨学科应用
CiteScore
6.50
自引率
13.60%
发文量
59
审稿时长
>12 weeks
期刊介绍: Bayesian Analysis is an electronic journal of the International Society for Bayesian Analysis. It seeks to publish a wide range of articles that demonstrate or discuss Bayesian methods in some theoretical or applied context. The journal welcomes submissions involving presentation of new computational and statistical methods; critical reviews and discussions of existing approaches; historical perspectives; description of important scientific or policy application areas; case studies; and methods for experimental design, data collection, data sharing, or data mining. Evaluation of submissions is based on importance of content and effectiveness of communication. Discussion papers are typically chosen by the Editor in Chief, or suggested by an Editor, among the regular submissions. In addition, the Journal encourages individual authors to submit manuscripts for consideration as discussion papers.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信