Optimised stacked machine learning algorithms for genomics and genetics disorder detection in the healthcare industry

IF 3.9 · Zone 4 (Biology) · JCR Q1, Genetics & Heredity
Amjad Rehman, Muhammad Mujahid, Tanzila Saba, Gwanggil Jeon
Functional & Integrative Genomics, vol. 24 (1). Published 2024-02-02. Journal Article.
DOI: 10.1007/s10142-024-01289-z
https://link.springer.com/article/10.1007/s10142-024-01289-z
Citations: 0

Abstract

With recent advances in precision medicine and healthcare computing, there is strong demand for machine learning algorithms in genomics that speed up the analysis of diseases and disorders. Technological advances in genomics and imaging give clinicians enormous amounts of data, but prediction is still mostly subjective, which can result in problematic medical treatment. Machine learning is being employed in several domains of the healthcare sector, including clinical research, early disease identification, and medicinal innovation with a historical perspective. The main objective of this study is to detect patients who, based on several medical standards, are more susceptible to having a genetic disorder. A genetic disease prediction algorithm was employed that uses the patient's health history to estimate the probability of a genetic disorder diagnosis. We developed a computationally efficient machine learning approach to predict the overall lifespan of patients with a genomic disorder and to classify and predict patients with a genetic disease. The proposed model stacks a support vector machine (SVM), a random forest (RF), and an extra trees classifier (ETC) using a two-layer meta-estimator. The first layer comprises all the baseline models, which predict outcomes from the dataset. The second layer comprises a component known as a meta-classifier. Experimental results indicate that the model achieved an accuracy of 90.45% and a recall of 90.19%. The area under the curve (AUC) is 98.1% for mitochondrial diseases, 97.5% for multifactorial diseases, and 98.8% for single-gene inheritance. The proposed approach presents a novel method for predicting patient prognosis in a manner that is unbiased, accurate, and comprehensive. In terms of identification accuracy, it outperforms human professionals using the current clinical standard for genetic disease classification. Implementing the stacked model will significantly improve the field of biomedical research by improving the anticipation of genetic diseases.
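
The two-layer stacking described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic (`make_classification` stands in for the patient records), the meta-classifier is assumed to be logistic regression (the abstract does not name it), hyperparameters are library defaults, and recall is macro-averaged because the abstract does not state the averaging used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import SVC

# ASSUMPTION: synthetic three-class data standing in for the patient dataset
# (three classes mirror the three disorder categories in the abstract).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Layer 1: the three base learners named in the abstract (SVM, RF, ETC).
# The SVM is wrapped with a scaler because it is sensitive to feature scale.
base_learners = [
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("etc", ExtraTreesClassifier(n_estimators=100, random_state=0)),
]

# Layer 2: a meta-classifier fitted on out-of-fold predictions of layer 1.
# ASSUMPTION: logistic regression; the abstract does not specify the model.
# With the default stack_method="auto", the forests contribute class
# probabilities and the SVM contributes decision_function scores.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_proba = stack.predict_proba(X_test)

accuracy = accuracy_score(y_test, y_pred)
# ASSUMPTION: macro averaging over the three classes.
macro_recall = recall_score(y_test, y_pred, average="macro")

# One-vs-rest AUC per class, analogous to the per-disorder AUCs reported.
y_bin = label_binarize(y_test, classes=stack.classes_)
per_class_auc = {
    int(c): roc_auc_score(y_bin[:, i], y_proba[:, i])
    for i, c in enumerate(stack.classes_)
}

print(f"accuracy={accuracy:.3f} macro_recall={macro_recall:.3f}")
print("per-class AUC:", {k: round(v, 3) for k, v in per_class_auc.items()})
```

Fitting the meta-classifier on cross-validated (out-of-fold) predictions, which `cv=5` requests here, keeps the second layer from simply memorising base-learner outputs on data those learners have already seen. The numbers printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract.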

Source journal
CiteScore: 3.50
Self-citation rate: 3.40%
Articles published: 92
Review time: 2 months
Journal description: Functional & Integrative Genomics is devoted to large-scale studies of genomes and their functions, including systems analyses of biological processes. The journal provides the research community with an integrated platform where researchers can share, review, and discuss their findings on important biological questions that will ultimately help answer the fundamental question: How do genomes work?