Batch effect reduction of microarray data with dependent samples using an empirical Bayes approach (BRIDGE).

IF 0.9 | Zone 4, Mathematics | Q3, Mathematics
Qing Xia, Jeffrey A Thompson, Devin C Koestler
Journal: Statistical Applications in Genetics and Molecular Biology, vol. 20, no. 4-6, pp. 101-119
Published: 2021-12-14
Publication type: Journal Article
DOI: 10.1515/sagmb-2021-0020 (https://doi.org/10.1515/sagmb-2021-0020)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9617207/pdf/nihms-1843789.pdf
Citations: 0

Abstract

Batch effect reduction of microarray data with dependent samples using an empirical Bayes approach (BRIDGE).

Batch effects present challenges in the analysis of high-throughput molecular data and are particularly problematic in longitudinal studies when interest lies in identifying genes/features whose expression changes over time, but time is confounded with batch. While many methods to correct for batch effects exist, most assume independence across samples, an assumption that is unlikely to hold in longitudinal microarray studies. We propose Batch effect Reduction of mIcroarray data with Dependent samples usinG Empirical Bayes (BRIDGE), a three-step parametric empirical Bayes approach that leverages technical replicate samples profiled at multiple timepoints/batches, so-called "bridge samples", to inform batch-effect reduction/attenuation in longitudinal microarray studies. Extensive simulation studies and an analysis of a real biological data set were conducted to benchmark the performance of BRIDGE against both ComBat and longitudinal ComBat. Our results demonstrate that while all methods perform well in facilitating accurate estimates of time effects, BRIDGE outperforms both ComBat and longitudinal ComBat in the removal of batch effects in data sets with bridging samples and, perhaps as a result, was observed to have improved statistical power for detecting genes with a time effect. BRIDGE demonstrated competitive performance in batch-effect reduction of confounded longitudinal microarray studies, in both simulated and real data sets, and may serve as a useful preprocessing method for researchers conducting longitudinal microarray studies that include bridging samples.
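
To make the role of bridge samples concrete, here is a minimal Python sketch of the general idea only. It is not the BRIDGE algorithm, whose three steps are not described in this abstract. The sketch assumes a purely additive per-gene batch shift and two batches. Because a bridge sample is a technical replicate profiled in both batches, its batch-2 minus batch-1 difference reflects the batch shift plus noise rather than a biological time effect. The function names `shrunken_batch_shifts` and `correct_batch2`, the normal prior, and the method-of-moments shrinkage are all illustrative assumptions.

```python
from statistics import mean, variance


def shrunken_batch_shifts(bridge_batch1, bridge_batch2):
    """Estimate a per-gene additive batch shift from paired bridge samples.

    bridge_batch1[g][i] and bridge_batch2[g][i] hold the same technical
    replicate i for gene g, profiled in batch 1 and batch 2.
    Needs at least two genes and two bridge samples per gene.
    Hypothetical helper for illustration; not the published BRIDGE method.
    """
    raw, sampling_var = [], []
    for row1, row2 in zip(bridge_batch1, bridge_batch2):
        diffs = [b - a for a, b in zip(row1, row2)]
        raw.append(mean(diffs))
        # Variance of the mean difference for this gene.
        sampling_var.append(variance(diffs) / len(diffs))

    # Assumed normal prior on the shifts, fitted by method of moments:
    # spread of the raw shifts minus the average sampling noise.
    prior_mean = mean(raw)
    prior_var = max(0.0, variance(raw) - mean(sampling_var))

    shrunk = []
    for d, v in zip(raw, sampling_var):
        total = prior_var + v
        # Weight on the gene's own estimate; 0 when both variances vanish.
        weight = prior_var / total if total > 0 else 0.0
        shrunk.append(weight * d + (1.0 - weight) * prior_mean)
    return shrunk


def correct_batch2(batch2, shifts):
    """Subtract each gene's estimated shift from all of its batch-2 values."""
    return [[y - s for y in row] for row, s in zip(batch2, shifts)]
```

In this sketch, genes whose bridge differences are noisy are pulled more strongly toward the across-gene mean shift, which is the usual motivation for empirical Bayes shrinkage. The same `shifts` would then be applied to the non-bridge samples of batch 2.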

Source journal
CiteScore: 1.20
Self-citation rate: 11.10%
Publication volume: 8
Review time: 6-12 weeks
Journal description: Statistical Applications in Genetics and Molecular Biology seeks to publish significant research on the application of statistical ideas to problems arising from computational biology. The focus of the papers should be on the relevant statistical issues, but papers should contain a succinct description of the biological problem being considered. The range of topics is wide and includes linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies. Both original research and review articles will be warmly received.