A New Association Analysis Method for Gut Microbial Compositional Data Using Ensemble Learning

T. Okui, Y. Matsuyama, S. Nakaji
Japanese journal of biometrics. Published 2019-01-31. DOI: 10.5691/JJB.39.55 (https://doi.org/10.5691/JJB.39.55)

Abstract

Many methods based on sequencing of the 16S ribosomal RNA (16S rRNA) gene have been proposed for analyzing gut microbial compositional data. Statistically, 16S rRNA sequencing data are multivariate count data. Because the total number of sequence reads differs between subjects, the data are generally normalized before a multivariate analysis model is fitted to test for association with a disease. However, appropriate methods for normalization have not yet been discussed or proposed. Rarefying is one normalization method: it equalizes the total counts across subjects by subsampling a fixed number of counts from each subject. We hypothesized that combining rarefying with ensemble learning could improve performance. We therefore propose an association analysis method that combines rarefying with ensemble learning, and we evaluated its performance in a simulation experiment using several multivariate data analysis methods. The proposed method outperformed the other analysis methods both in identifying response-associated variables and in classifying the response variable. We also applied each evaluated method to gut microbial data from Japanese people and compared the results.
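The abstract does not give the details of the proposed procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that rarefying means drawing a fixed number of reads without replacement from each subject, and that the ensemble averages a per-taxon score over many independently rarefied copies of the count table. The function names (`rarefy`, `ensemble_scores`), the toy data, and the score itself (difference in mean rarefied counts between the two response groups) are all hypothetical stand-ins for a real base learner.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one subject's taxon counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the subject's total read count")
    # Expand counts into one label per read, e.g. [3, 1] -> [0, 0, 0, 1].
    reads = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    drawn = Counter(rng.sample(reads, depth))
    return [drawn.get(taxon, 0) for taxon in range(len(counts))]


def ensemble_scores(table, labels, depth, n_rounds=50, seed=0):
    """Average a per-taxon score over repeated rarefied copies of `table`.

    The score is an illustrative assumption, not the paper's model:
    mean rarefied count in label-1 subjects minus the mean in label-0 subjects.
    """
    rng = random.Random(seed)
    n_taxa = len(table[0])
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    totals = [0.0] * n_taxa
    for _ in range(n_rounds):
        # Each round sees a different random subsample of every subject.
        rarefied = [rarefy(row, depth, rng) for row in table]
        for j in range(n_taxa):
            case_mean = sum(rarefied[i][j] for i in cases) / len(cases)
            control_mean = sum(rarefied[i][j] for i in controls) / len(controls)
            totals[j] += case_mean - control_mean
    return [t / n_rounds for t in totals]


# Toy count table (rows: subjects, columns: taxa); values are made up.
table = [
    [50, 30, 20],
    [120, 60, 20],
    [10, 70, 20],
    [30, 150, 20],
]
labels = [1, 1, 0, 0]  # hypothetical binary response per subject
depth = min(sum(row) for row in table)  # rarefy to the smallest library size
scores = ensemble_scores(table, labels, depth)
print(scores)
```

A single rarefied table discards reads at random, so any one subsample adds noise; averaging over many subsamples is one plausible way to reduce that noise, which is the motivation the abstract gives for pairing rarefying with ensemble learning. In practice the score would be replaced by the fitted multivariate model of interest.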