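Biomarker discovery and development of prognostic prediction model using metabolomic panel in breast cancer patients: a hybrid methodology integrating machine learning and explainable artificial intelligence.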

IF 3.9 · Zone 3, Biology · JCR Q2 (Biochemistry & Molecular Biology)
Frontiers in Molecular Biosciences · Pub Date: 2024-12-18 · eCollection Date: 2024-01-01 · DOI: 10.3389/fmolb.2024.1426964
Fatma Hilal Yagin, Yasin Gormez, Fahaid Al-Hashem, Irshad Ahmad, Fuzail Ahmad, Luca Paolo Ardigò
Volume 11, article 1426964. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11688212/pdf/
Citations: 0

Abstract

Biomarker discovery and development of prognostic prediction model using metabolomic panel in breast cancer patients: a hybrid methodology integrating machine learning and explainable artificial intelligence.

Background: Breast cancer (BC) is a significant cause of morbidity and mortality in women. Although the important role of metabolism in the molecular pathogenesis of BC is known, there is still a need for robust metabolomic biomarkers and predictive models that will enable the detection and prognosis of BC. This study aims to identify targeted metabolomic biomarker candidates based on explainable artificial intelligence (XAI) for the specific detection of BC.

Methods: The study used targeted metabolomics data from plasma samples of BC patients (n = 102) and healthy controls (n = 99). Machine learning (ML) models were first developed on the raw data, then feature selection methods were applied, and the results were compared. SHapley Additive exPlanations (SHAP), an XAI method, was used to clinically explain the decisions of the optimal model in BC prediction.
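To make the SHAP-based ranking idea concrete, the following is a minimal sketch that computes exact Shapley values by enumerating feature coalitions and ranks features by mean absolute attribution. It is an illustration only, not the authors' pipeline: the linear `score` function, its weights, the placeholder names `metabolite_a` to `metabolite_c`, and the toy rows are all assumptions, and the mean-value baseline is one common choice among several. Practical analyses typically rely on a dedicated SHAP library rather than brute-force enumeration, which scales exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        # Evaluate the model with features outside `subset` replaced by the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


def rank_features(predict, rows, names):
    """Rank features by mean absolute Shapley value over all rows."""
    n = len(names)
    baseline = [sum(r[i] for r in rows) / len(rows) for i in range(n)]
    importance = [0.0] * n
    for r in rows:
        for i, p in enumerate(shapley_values(predict, r, baseline)):
            importance[i] += abs(p) / len(rows)
    return sorted(zip(names, importance), key=lambda t: t[1], reverse=True)


# Hypothetical toy model and data (not from the study).
def score(z):
    return 2.0 * z[0] - 0.5 * z[1] + 0.0 * z[2]


names = ["metabolite_a", "metabolite_b", "metabolite_c"]
rows = [[1.0, 2.0, 5.0], [3.0, 0.0, 1.0], [2.0, 4.0, 3.0]]
ranking = rank_features(score, rows, names)
```

A feature selection step in this spirit would keep the top-ranked features from `ranking` and refit the downstream classifier on that subset.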

Results: Variable selection increased the performance of ML models in BC classification, and the optimal model was obtained with the logistic regression (LR) classifier after support vector machine (SVM)-SHAP-based feature selection. SHAP annotations of the LR model revealed that the amino acids leucine, isoleucine, L-alloisoleucine, norleucine, and homoserine were the most important potential BC diagnostic biomarkers. Combining the identified metabolite markers provided robust BC classification measures with precision, recall, and specificity of 89.50%, 88.38%, and 83.67%, respectively.
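For readers less familiar with the three reported measures, the sketch below shows how precision, recall, and specificity are derived from a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the study's confusion matrix, and the function name `classification_metrics` is an assumption of this sketch.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, and specificity from confusion-matrix counts."""
    return {
        # Of all samples predicted positive, the fraction that are truly positive.
        "precision": tp / (tp + fp),
        # Of all truly positive samples, the fraction that were detected (sensitivity).
        "recall": tp / (tp + fn),
        # Of all truly negative samples, the fraction correctly predicted negative.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts (not from the study).
metrics = classification_metrics(tp=8, fp=2, tn=7, fn=3)
```

Precision and specificity both penalize false positives but use different denominators, which is why the two values can differ for the same classifier.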

Conclusion: This study adds valuable information to the discovery of BC biomarkers and underscores the potential of targeted metabolomics-based diagnostic advances in the management of BC.

Source journal
Frontiers in Molecular Biosciences (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 7.20
Self-citation rate: 4.00%
Articles published: 1361
Review time: 14 weeks
About the journal: Much of contemporary investigation in the life sciences is devoted to the molecular-scale understanding of the relationships between genes and the environment, in particular dynamic alterations in the levels, modifications, and interactions of cellular effectors, including proteins. Frontiers in Molecular Biosciences offers an international publication platform for basic as well as applied research; we encourage contributions spanning both established and emerging areas of biology. To this end, the journal draws from empirical disciplines such as structural biology, enzymology, biochemistry, and biophysics, capitalizing as well on the technological advancements that have enabled metabolomics and proteomics measurements in massively parallel throughput, and the development of robust and innovative computational biology strategies. We also recognize influences from medicine and technology, welcoming studies in molecular genetics, molecular diagnostics and therapeutics, and nanotechnology. Our ultimate objective is the comprehensive illustration of the molecular mechanisms regulating proteins, nucleic acids, carbohydrates, lipids, and small metabolites in organisms across all branches of life. In addition to interesting new findings, techniques, and applications, Frontiers in Molecular Biosciences will consider new testable hypotheses to inspire different perspectives and stimulate scientific dialogue. The integration of in silico, in vitro, and in vivo approaches will benefit endeavors across all domains of the life sciences.