aiGeneR 1.0: An Artificial Intelligence Technique for the Revelation of Informative and Antibiotic Resistant Genes in Escherichia coli

Impact factor: 3.3 | JCR: Q2 | Category: Biochemistry & Molecular Biology
Debasish Swapnesh Kumar Nayak, Saswati Mahapatra, Sweta Padma Routray, Swayamprabha Sahoo, Santanu Kumar Sahoo, Mostafa M Fouda, Narpinder Singh, Esma R Isenovic, Luca Saba, Jasjit S Suri, Tripti Swarnkar
Journal: Frontiers in bioscience (Landmark edition). Published: 2024-02-22. DOI: 10.31083/j.fbl2902082 (https://doi.org/10.31083/j.fbl2902082)
Abstract

Background: Escherichia coli (E. coli) bacteria that cause urinary tract infections (UTI) carry several antibiotic resistance genes (ARG), so identifying these ARG is important. Artificial Intelligence (AI) has previously been applied to gene expression data, but never to the detection and classification of bacterial ARG. We hypothesize that if the data are correctly prepared, the right features are selected, and Deep Learning (DL) classification models are optimized, then non-linear DL models will (i) perform better than Machine Learning (ML) models, (ii) achieve higher accuracy, (iii) identify the hub genes, and (iv) identify gene pathways accurately. We therefore designed aiGeneR, the first system of its kind to use DL-based models to identify ARG in E. coli from gene expression data.

Methodology: aiGeneR is a tandem pipeline in which quality control combined with feature extraction is followed by AI-based classification of ARG. We used cross-validation to evaluate the performance of aiGeneR in terms of accuracy, precision, recall, and F1-score. We further analyzed the effect of sample size on model generalization and compared the findings against a power analysis. aiGeneR was validated scientifically and biologically for hub genes and pathways. We benchmarked aiGeneR against two linear and two other non-linear AI models.
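
The evaluation protocol named above (cross-validation scored by accuracy, precision, recall, and F1-score) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `classification_metrics` and `k_fold_indices` are hypothetical, the labels are binary (1 = resistant, 0 = not), the folds are contiguous and unshuffled, and the toy labels are made up. The abstract does not specify the number of folds or how aiGeneR implements these steps.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k contiguous folds; return (train, test) index lists."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


if __name__ == "__main__":
    # Made-up labels and predictions, purely to exercise the functions.
    print(classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    for train_idx, test_idx in k_fold_indices(5, 2):
        print("train:", train_idx, "test:", test_idx)
```

In a real evaluation, a model would be fitted on each `train` split and scored on the matching `test` split, with the per-fold metrics averaged. Gene expression samples would normally be shuffled or stratified by class before splitting, which this sketch omits.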

Results: aiGeneR identified tetM (an ARG) and achieved an accuracy of 93% with an area under the curve (AUC) of 0.99 (p < 0.05). The mean accuracy of the non-linear models was 22% higher than that of the linear models. We validated aiGeneR both scientifically and biologically.
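
For readers unfamiliar with the AUC figure reported above, the following sketch shows one standard way to compute the area under the ROC curve: as the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as half. The function name `roc_auc` and the toy scores are assumptions made for illustration. They are not the paper's data, and the abstract does not say how the authors computed their AUC.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking (Mann-Whitney U statistic)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and classifier scores, not taken from the paper.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))  # 3 of 4 pairs are ranked correctly -> 0.75
```

The pairwise form is quadratic in the number of samples, which is acceptable for small gene expression cohorts. For larger data, a sort-based computation gives the same value more efficiently.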

Conclusions: aiGeneR successfully detected the E. coli genes, validating our four hypotheses.
