Predicting disease recurrence in breast cancer patients using machine learning models with clinical and radiomic characteristics: a retrospective study.
Saadia Azeroual, Fatima-Ezzahraa Ben-Bouazza, Amine Naqi, Rajaa Sebihi
Journal of the Egyptian National Cancer Institute, 36(1):20. Published 2024-06-10. DOI: 10.1186/s43046-024-00222-6 (https://doi.org/10.1186/s43046-024-00222-6)
Abstract
Background: This study aims to predict breast cancer recurrence with three machine learning models in a highly heterogeneous sample of patients with varying disease types and stages.
Methods: A heterogeneous group of patients with varying cancer types and stages, including both triple-negative breast cancer (TNBC) and non-triple-negative breast cancer (non-TNBC), was examined. Three distinct models were built, each using the following five machine learning techniques: Adaptive Boosting (AdaBoost), Random Under-sampling Boosting (RUSBoost), Extreme Gradient Boosting (XGBoost), support vector machines (SVM), and logistic regression. The clinical model applied these algorithms to clinical and pathology data. The radiomic model applied them to imaging features from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI), and the merged model combined the two types of data. Each technique was evaluated using several criteria, including the receiver operating characteristic (ROC) curve, precision, recall, and F1 score.
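As an illustration of the evaluation criteria named in the Methods, the following is a minimal sketch in plain Python of how precision, recall, F1 score, and the area under the ROC curve can be computed for a binary recurrence label. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions chosen for illustration and are not data from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 = recurrence."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive (recurrence) class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data).
TOY_LABELS = [1, 0, 1, 1, 0, 0, 1, 0]
TOY_SCORES = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
THRESHOLD = 0.5  # assumed decision threshold

if __name__ == "__main__":
    toy_pred = [1 if s >= THRESHOLD else 0 for s in TOY_SCORES]
    precision, recall, f1 = precision_recall_f1(TOY_LABELS, toy_pred)
    print(precision, recall, f1, roc_auc(TOY_LABELS, TOY_SCORES))
```

On the toy data, precision, recall, and F1 are each 0.75 and the AUC is 0.9375. Precision, recall, and F1 depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason studies of imbalanced outcomes such as recurrence often report several criteria together.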
Results: The results suggest that integrating clinical and radiomic data improves predictive accuracy for breast cancer recurrence. Among the five algorithms, XGBoost performed best.
Conclusion: These findings contribute to breast cancer research, particularly to the prediction of recurrence, and may inform future investigations and clinical interventions that seek to improve the accuracy and effectiveness of recurrence prediction in breast cancer patients.
About the journal:
As the official publication of the National Cancer Institute, Cairo University, the Journal of the Egyptian National Cancer Institute (JENCI) is an open access, peer-reviewed journal that publishes the latest innovations in oncology, thereby providing academics and clinicians with a leading research platform. JENCI welcomes submissions pertaining to all fields of basic, applied, and clinical cancer research. Main topics of interest include: local and systemic anticancer therapy (with specific interest in applied cancer research from developing countries); experimental oncology; early cancer detection; randomized trials (including negative ones); and key emerging fields of personalized medicine, such as molecular pathology, bioinformatics, and biotechnologies.