Application of artificial intelligence in cervical cancer diagnosis using risk factors: A systematic review

IF 4.7

Telematics and Informatics Reports. Pub Date: 2025-09-18. DOI: 10.1016/j.teler.2025.100250

Tabu S. Kondo, Daniel Ngondya, Hamim Rusheke

Volume 20, Article 100250. Journal Article. https://www.sciencedirect.com/science/article/pii/S2772503025000647

Citations: 0

Abstract

Timely screening for cervical cancer enhances treatment efficacy. However, conventional screening approaches are intrusive and inaccessible to women, especially in resource-constrained settings. While applying machine learning to cervical cancer diagnosis has the potential to improve screening rates, privacy, and inclusion, results from existing works indicate a wide disparity in approaches. In this work, a systematic literature review was conducted to highlight gaps in the application of machine learning to cervical cancer diagnosis from risk factors. Existing reviews on cervical cancer diagnosis have focused on image datasets and have considered only suitable machine learning algorithms, their performance, and the features in the datasets used. Little attention has been paid to data preprocessing, model implementation, and usability testing. In this work, four scholarly databases, namely Scopus, ScienceDirect, PubMed, and BioMed Central (BMC), were queried using a combination of relevant keywords. Twenty-seven (27) original journal articles written in English and published between January 2014 and January 2024 were retrieved and included in the study. Results indicate that 88.9% of the studied works used a single dataset, pointing to data sharing challenges. Only one work (3.7%) performed comprehensive data preprocessing; the rest performed partial or no data preprocessing. While Sub-Saharan Africa bears the largest cervical cancer burden, it has shown minimal involvement in cervical cancer diagnosis using machine learning, with no collaboration among experts and countries. Works have focused substantially on the performance of machine learning models, with the top 5 most commonly used algorithms including Decision Tree, Support Vector Machine, Random Forest, and Logistic Regression. The implementation of the models and the assessment of the usability and acceptance of the resulting applications, however, have been neglected.
Policies on machine-learning-based disease diagnosis tools should emphasize diversity, equity, and inclusivity in dataset creation, require comprehensive and standardized data preprocessing pipelines, and prioritize human-centered design, usability testing, and clinical validation to ensure that solutions are reliable and acceptable to medical professionals and relevant stakeholders.
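
The percentages reported above can be cross-checked against the stated total of 27 included articles. The short Python sketch below is only an illustrative consistency check: the function names are hypothetical, and the count of 24 single-dataset studies is inferred from the 88.9% figure rather than stated in the abstract.

```python
TOTAL_STUDIES = 27  # stated in the abstract


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def implied_count(percent: float, total: int = TOTAL_STUDIES) -> int:
    """Return the whole number of studies closest to the given percentage."""
    return round(percent / 100 * total)


# Inferred from the reported 88.9%; the abstract does not give this count.
single_dataset = implied_count(88.9)

# Stated directly: "only one work (3.7%)" did comprehensive preprocessing.
full_preprocessing = 1

print(single_dataset, share(single_dataset))          # studies using one dataset
print(full_preprocessing, share(full_preprocessing))  # comprehensive preprocessing
print(TOTAL_STUDIES - full_preprocessing)             # partial or no preprocessing
```

Under these assumptions, both reported percentages are consistent with whole-number counts out of 27 studies.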


Source journal

Telematics and Informatics Reports

CiteScore

1.90

Self-citation rate

0.00%

Number of articles published