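Analytical strategies in early breast cancer diagnostic biomarker discovery by machine learning methods: Promises, advances and outlooks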

IF 12 · Zone 1 (Chemistry) · JCR Q1 (CHEMISTRY, ANALYTICAL)

Trends in Analytical Chemistry · Pub Date: 2025-08-14 · DOI: 10.1016/j.trac.2025.118412

Seyed Morteza Naghib, Mohammad Ali Khorasani, Fariborz Sharifianjazi, Ketevan Tavamaishvili

Volume 192, Article 118412. Source: https://www.sciencedirect.com/science/article/pii/S0165993625002808

Citations: 0

Abstract


Breast cancer (BC) remains one of the leading causes of cancer-related mortality worldwide, with early detection playing a pivotal role in improving patient survival and treatment outcomes. Biomarkers serve as critical molecular indicators that facilitate the early diagnosis of breast cancer, allowing for timely intervention before the disease progresses to more advanced stages. Traditional methods for biomarker discovery, including immunohistochemistry, polymerase chain reaction (PCR), and enzyme-linked immunosorbent assays (ELISA), have been instrumental in identifying breast cancer markers. However, these approaches often require extensive validation, are time-consuming, and may lack the ability to effectively analyze high-dimensional datasets. The rapid advancements in machine learning (ML) have transformed biomarker discovery by enabling the analysis of complex multi-omics data, integrating genomic, proteomic, and imaging datasets to identify novel biomarkers with enhanced accuracy. This study focuses on the application of ML in detecting the key biomarkers COL11A1, TOP2A, MMP1, and EZH2, which are associated with tumor invasiveness, proliferation, and metastatic potential in early-stage breast cancer. These biomarkers were identified through ML-based predictive models such as Random Forest (RF), Support Vector Machines (SVMs), XGBoost, and Deep Neural Networks, which have demonstrated superior performance in distinguishing malignant from benign cases. Our findings highlight the potential of ML-driven biomarker discovery in revolutionizing breast cancer diagnostics by improving risk stratification, enhancing predictive accuracy, and facilitating personalized treatment approaches. By leveraging AI-powered methodologies, clinicians can move toward a data-driven, precision medicine approach, ultimately reducing the burden of late-stage breast cancer diagnoses and mortality rates. However, integrating ML models into routine clinical practice requires addressing key challenges, such as data standardization, model interpretability, and validation through large-scale prospective studies. Future advancements in deep learning (DL), federated learning, and explainable AI (XAI) are expected to further refine these models, ensuring their reliability and applicability in clinical settings.
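The abstract describes ranking candidate biomarkers with predictive models that separate malignant from benign cases. As a rough illustration only, the sketch below ranks features by permutation importance on a synthetic cohort. Everything in it is an assumption rather than the authors' pipeline: the data are simulated, the feature indices are hypothetical marker slots (not COL11A1, TOP2A, MMP1, or EZH2 measurements), and a simple nearest-centroid classifier stands in for RF, SVM, XGBoost, or a neural network so that the example needs only the Python standard library.

```python
import random
import statistics


def make_synthetic_cohort(n_samples=200, n_features=6, informative=(0, 1), seed=0):
    """Simulate marker levels; only the `informative` columns differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 0 = benign, 1 = malignant (synthetic labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative markers in the malignant class
        X.append(row)
        y.append(label)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid classifier: store one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, n_repeats=10, seed=1):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(statistics.fmean(drops))
    return importances


X, y = make_synthetic_cohort()
X_train, y_train = X[:140], y[:140]   # fit on the first 140 samples
X_test, y_test = X[140:], y[140:]     # score importance on held-out samples
model = fit_centroids(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)
importances = permutation_importance(model, X_test, y_test)
ranking = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
print("held-out accuracy:", round(test_accuracy, 3))
print("features ranked by importance:", ranking)
```

Scoring importance on held-out samples rather than the training set is a deliberate choice here: it keeps the ranking from rewarding features the model merely memorized. A real study would still need the external and prospective validation the abstract calls for.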


Source journal

Trends in Analytical Chemistry (Chemistry: Analytical Chemistry)

CiteScore: 20.00

Self-citation rate: 4.60%

Articles published: 257

Review time: 3.4 months

About the journal: TrAC publishes succinct and critical overviews of recent advancements in analytical chemistry, designed to assist analytical chemists and other users of analytical techniques. These reviews offer excellent, up-to-date, and timely coverage of various topics within analytical chemistry. Encompassing areas such as analytical instrumentation, biomedical analysis, biomolecular analysis, biosensors, chemical analysis, chemometrics, clinical chemistry, drug discovery, environmental analysis and monitoring, food analysis, forensic science, laboratory automation, materials science, metabolomics, pesticide-residue analysis, pharmaceutical analysis, proteomics, surface science, and water analysis and monitoring, these critical reviews provide comprehensive insights for practitioners in the field.