A metagenomic hybrid classifier for paediatric inflammatory bowel disease

B. Wingfield, S. Coleman, T. McGinnity, A. Bjourson

2016 International Joint Conference on Neural Networks (IJCNN), published 2016-07-24
DOI: 10.1109/IJCNN.2016.7727318 (https://doi.org/10.1109/IJCNN.2016.7727318)
Citations: 12

Abstract

Inflammatory bowel disease (IBD) is a group of inflammatory diseases of the human colon and small intestine. IBD symptoms are non-specific, and diagnosis can be delayed because an invasive colonoscopy is required for confirmation. Delayed diagnosis is linked to poor growth in children. Imbalances in the human intestinal microbiome (the community of microorganisms that reside in the human gut) are thought to contribute to the development of IBD. Work to date on classifying host health status from patterns in human microbiomes with supervised learning has focused on modelling what is present in the gut (i.e. a bacterial census) with the random forest algorithm. Metagenomic shotgun sequencing is required to understand what is occurring in the gut (i.e. gene functions), but it is often cost-prohibitive for hundreds of samples. However, gene functions can be predicted with the Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PiCRUSt) software package, which could represent a valuable source of new features. In this paper we investigate feature relevance across the combined feature set with the Boruta algorithm. We find that the majority of relevant features are from the predicted metagenome. Support vector machines (SVM) and multilayer perceptrons (MLP) are rarely used with microbiomic datasets but offer several theoretical advantages. To determine whether the new features and alternative algorithms are appropriate, we experiment with a range of machine learning and computational intelligence algorithms. With the best-performing algorithms we also implement a conditional multiple classifier system that can identify IBD presence, IBD subtype, and IBD activity from a non-invasive stool sample.
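
The abstract does not describe how the conditional multiple classifier system is wired together, so the following is only a minimal sketch of one plausible reading: a presence classifier runs first, and the subtype and activity classifiers are consulted only when IBD is predicted. The function name `classify_conditionally`, the `Diagnosis` record, the label strings ("IBD", "CD", "UC", "active", "remission") and the toy threshold classifiers are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A stage classifier maps a feature vector to a label string.
Classifier = Callable[[Sequence[float]], str]


@dataclass
class Diagnosis:
    has_ibd: bool
    subtype: Optional[str] = None   # only set when has_ibd is True
    activity: Optional[str] = None  # only set when has_ibd is True


def classify_conditionally(
    features: Sequence[float],
    presence_clf: Classifier,
    subtype_clf: Classifier,
    activity_clf: Classifier,
) -> Diagnosis:
    """Run the subtype and activity stages only if IBD is predicted."""
    if presence_clf(features) != "IBD":
        # Later stages are skipped entirely for samples predicted healthy.
        return Diagnosis(has_ibd=False)
    return Diagnosis(
        has_ibd=True,
        subtype=subtype_clf(features),
        activity=activity_clf(features),
    )


# Hypothetical stand-in classifiers: fixed thresholds on made-up features.
# Real stages would be trained models over taxonomic and predicted
# metagenome features.
def toy_presence(x: Sequence[float]) -> str:
    return "IBD" if x[0] > 0.5 else "healthy"


def toy_subtype(x: Sequence[float]) -> str:
    return "CD" if x[1] > 0.5 else "UC"


def toy_activity(x: Sequence[float]) -> str:
    return "active" if x[2] > 0.5 else "remission"


if __name__ == "__main__":
    sample = [0.9, 0.2, 0.8]  # made-up feature values
    print(classify_conditionally(sample, toy_presence, toy_subtype, toy_activity))
```

Because each stage is just a callable, any of the algorithms named in the abstract (random forest, SVM, MLP) could be slotted into a stage once trained. Whether the original system also conditions the activity stage on the predicted subtype is not stated, so this sketch treats the two as independent.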