Application of artificial intelligence in cervical cancer diagnosis using risk factors: A systematic review
Authors: Tabu S. Kondo, Daniel Ngondya, Hamim Rusheke
Journal: Telematics and Informatics Reports, volume 20, Article 100250
Publication date: 2025-09-18
Publication type: Journal Article
DOI: 10.1016/j.teler.2025.100250
URL: https://www.sciencedirect.com/science/article/pii/S2772503025000647
Citation count: 0
Abstract
Timely screening for cervical cancer improves treatment efficacy. However, conventional screening approaches are intrusive and inaccessible to women, especially in resource-constrained settings. While applying machine learning to cervical cancer diagnosis has the potential to improve screening rates, privacy, and inclusion, results from existing works indicate a wide disparity in approaches. This work presents a systematic literature review that highlights gaps in the application of machine learning to cervical cancer diagnosis from risk factors. Existing reviews on cervical cancer diagnosis have focused on image datasets and have considered only suitable machine learning algorithms, their performance, and the features in the datasets used. Little attention has been paid to data preprocessing, model implementation, and usability testing. Four scholarly databases, namely Scopus, ScienceDirect, PubMed, and BioMed Central (BMC), were queried using a combination of relevant keywords. Twenty-seven (27) original journal articles written in English and published between January 2014 and January 2024 were retrieved and included in the study. Results indicate that 88.9% of the studied works used a single dataset, pointing to data-sharing challenges. Only one work (3.7%) performed comprehensive data preprocessing; the rest performed partial or no data preprocessing. While Sub-Saharan Africa bears the largest cervical cancer burden, it has shown minimal involvement in cervical cancer diagnosis using machine learning, with no collaboration among experts and countries. Works have focused substantially on the performance of machine learning models, with the most commonly used algorithms including Decision Tree, Support Vector Machine, Random Forest, and Logistic Regression. The implementation of the models and the assessment of the usability and acceptance of the resulting applications, however, have been neglected.
Policies on machine learning-based disease diagnosis tools should emphasize diversity, equity, and inclusivity in dataset creation as well as comprehensive and standardized data preprocessing pipelines, and should prioritize human-centered design, usability testing, and clinical validation to ensure that solutions are reliable and acceptable to medical professionals and relevant stakeholders.
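
As a quick arithmetic check on the proportions reported in the abstract, the sketch below recovers the whole-number counts that correspond to 88.9% and 3.7% of the 27 reviewed works. The function names are illustrative and do not come from the review; the only inputs taken from the abstract are the total (27) and the two percentages.

```python
TOTAL_WORKS = 27  # number of reviewed articles, as stated in the abstract


def share(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def counts_matching(percent, total):
    """List every whole-number count whose rounded share equals `percent`.

    Hypothetical helper used only for this consistency check.
    """
    return [n for n in range(total + 1) if share(n, total) == percent]


# Works that used a single dataset (reported as 88.9%)
single_dataset = counts_matching(88.9, TOTAL_WORKS)

# Works with comprehensive preprocessing (reported as 3.7%, stated as one work)
full_preprocessing = counts_matching(3.7, TOTAL_WORKS)

print(single_dataset, full_preprocessing)
```

Under these assumptions, 88.9% corresponds to 24 of the 27 works and 3.7% to a single work, which agrees with the "only one work" wording in the abstract.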