Comparison of Machine Learning Models for Classification of Breast Cancer Risk Based on Clinical Data

Impact factor: 1.5 · JCR Q4 (Oncology)

Cancer Reports · Pub Date: 2025-04-02 · DOI: 10.1002/cnr2.70175

Haniyeh Rafiepoor, Alireza Ghorbankhanloo, Kazem Zendehdel, Zahra Zangeneh Madar, Sepideh Hajivalizadeh, Zeinab Hasani, Ali Sarmadi, Behzad Amanpour-Gharaei, Mohammad Amin Barati, Mozafar Saadat, Seyed-Ali Sadegh-Zadeh, Saeid Amanpour


Citations: 0

Abstract

Background

Breast cancer (BC) is a major global health concern, with rising incidence and mortality rates in many developing countries. Effective BC risk assessment models are crucial for prevention and early detection. The Gail model, a traditional logistic regression-based model, has been widely used, but its predictive performance may be limited by its linear assumptions. With the rapid advancement of artificial intelligence (AI) in medical sciences, various complex machine learning algorithms have been developed for risk prediction, including for BC.
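As a rough illustration of what "linear assumptions" means for a logistic regression-based model, the generic form below writes the log-odds of disease as a linear combination of risk factors. The symbols p, x_j, and β_j are assumptions introduced here for illustration; this is the textbook logistic regression form, not the Gail model's published parameterization.

```latex
\log\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j
```

In this form each risk factor contributes additively to the log-odds, so interactions or non-linear effects are captured only if they are added as explicit terms.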

Aims

This study aims to compare AI-based models with the traditional Gail model for assessing BC risk in a population dataset, and to evaluate how well these models predict BC risk.

Methods and Results

This study involved 942 newly diagnosed BC patients and 975 healthy controls at the Cancer Institute in the IKH Hospital Complex, Tehran. Ten classification algorithms were applied to the dataset. The accuracy, sensitivity, and precision of the machine learning algorithms, together with their feature importance, were assessed and compared with previous studies. The AI algorithms alone did not significantly improve predictive performance compared with the Gail model. However, the importance of variables varied significantly among the AI algorithms. Understanding feature importance and interactions is crucial in AI modeling to enhance accuracy and identify critical risk factors.
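The evaluation metrics named above can be sketched as follows. This is a minimal illustration using the standard definitions of accuracy, sensitivity, and precision for a binary case/control classifier; the helper names and the toy labels are hypothetical and are not taken from the study's data or code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # cases correctly predicted as cases
    fp: int  # controls wrongly predicted as cases
    tn: int  # controls correctly predicted as controls
    fn: int  # cases wrongly predicted as controls


def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for labels coded 1 (case) / 0 (control)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def accuracy(c):
    """Fraction of all subjects classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def sensitivity(c):
    """Fraction of true cases that the model flags (also called recall)."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


def precision(c):
    """Fraction of predicted cases that are true cases."""
    predicted_positive = c.tp + c.fp
    return c.tp / predicted_positive if predicted_positive else 0.0


# Toy labels (hypothetical): 4 cases followed by 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(accuracy(counts), sensitivity(counts), precision(counts))
```

Reporting sensitivity and precision alongside accuracy matters in a case-control setting like this one, because a model can reach reasonable accuracy while still missing many true cases or flagging many controls.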

Conclusion

This study concluded that incorporating specific risk factors, such as genetic and image-related variables, may be necessary to further enhance the accuracy of BC risk prediction models. Furthermore, future research should address the modeling issues that arise in models with a restricted number of features.


