Nonparametric Bayes Differential Analysis of Multigroup DNA Methylation Data.

IF 2.5 · Zone 2 (Mathematics) · JCR Q1: Mathematics, Interdisciplinary Applications

Bayesian Analysis. Pub Date: 2025-06-01. Epub Date: 2023-11-23. DOI: 10.1214/23-ba1407

Chiyu Gu, Veerabhadran Baladandayuthapani, Subharup Guha

Bayesian Analysis, volume 20, issue 2, pages 489-518. Journal Article. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12094113/pdf/

Citations: 0

Abstract

DNA methylation datasets in cancer studies comprise measurements on a large number of genomic locations, called cytosine-phosphate-guanine (CpG) sites, with complex correlation structures. A fundamental goal of these studies is the development of statistical techniques that can identify disease genomic signatures across multiple patient groups defined by different experimental or biological conditions. We propose BayesDiff, a nonparametric Bayesian approach for differential analysis relying on a novel class of first-order mixture models called the Sticky Pitman-Yor process or two-restaurant two-cuisine franchise (2R2CF). The BayesDiff methodology flexibly utilizes information from all CpG sites or biomarker probes, adaptively accommodates any serial dependence due to the widely varying inter-probe distances, and makes posterior inferences about the differential genomic signature of patient groups. Using simulation studies, we demonstrate the effectiveness of the BayesDiff procedure relative to existing statistical techniques for differential DNA methylation. The methodology is applied to analyze a gastrointestinal (GI) cancer dataset exhibiting serial correlation and complex interaction patterns. The results support and complement known aspects of DNA methylation and gene association in upper GI cancers.
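For readers unfamiliar with the Pitman-Yor process named in the abstract, the following is a minimal sketch of the standard (non-sticky) Pitman-Yor Chinese restaurant process. It is not the paper's Sticky Pitman-Yor or 2R2CF construction, whose details the abstract does not give. The function name `pitman_yor_crp` and its parameters `discount` and `concentration` are illustrative assumptions, not identifiers from BayesDiff.

```python
import random


def pitman_yor_crp(n, discount, concentration, seed=0):
    """Seat n customers by the textbook Pitman-Yor Chinese restaurant process.

    Returns the list of table sizes (cluster sizes). Illustrative only:
    this is the plain process, not the sticky/2R2CF variant in the paper.
    """
    if not 0 <= discount < 1 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    rng = random.Random(seed)
    table_sizes = []
    for _ in range(n):
        if not table_sizes:
            # The first customer always opens the first table.
            table_sizes.append(1)
            continue
        k = len(table_sizes)
        # Unnormalised seating weights: an existing table of size n_j gets
        # (n_j - discount); a new table gets (concentration + k * discount).
        weights = [size - discount for size in table_sizes]
        weights.append(concentration + k * discount)
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            table_sizes.append(1)
        else:
            table_sizes[choice] += 1
    return table_sizes


if __name__ == "__main__":
    sizes = pitman_yor_crp(200, discount=0.5, concentration=1.0, seed=1)
    print(len(sizes), "clusters; largest has", max(sizes), "members")
```

Setting `discount` to 0 recovers the Dirichlet process seating rule; larger discounts tend to produce more, smaller clusters.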


Source journal

Bayesian Analysis (Mathematics: Mathematics, Interdisciplinary Applications)

CiteScore

6.50

Self-citation rate

13.60%

Articles published

Review time

>12 weeks

About the journal: Bayesian Analysis is an electronic journal of the International Society for Bayesian Analysis. It seeks to publish a wide range of articles that demonstrate or discuss Bayesian methods in some theoretical or applied context. The journal welcomes submissions involving presentation of new computational and statistical methods; critical reviews and discussions of existing approaches; historical perspectives; description of important scientific or policy application areas; case studies; and methods for experimental design, data collection, data sharing, or data mining. Evaluation of submissions is based on importance of content and effectiveness of communication. Discussion papers are typically chosen by the Editor in Chief, or suggested by an Editor, from among the regular submissions. In addition, the journal encourages individual authors to submit manuscripts for consideration as discussion papers.