{"title":"Improving the generalizability of white blood cell classification with few-shot domain adaptation","authors":"Manon Chossegros , François Delhommeau , Daniel Stockholm , Xavier Tannier","doi":"10.1016/j.jpi.2024.100405","DOIUrl":"10.1016/j.jpi.2024.100405","url":null,"abstract":"<div><div>The morphological classification of nucleated blood cells is fundamental for the diagnosis of hematological diseases. Many Deep Learning algorithms have been implemented to automate this classification task, but most of the time they fail to classify images coming from different sources. This is known as “domain shift”. Although some research has been conducted in this area, domain adaptation techniques are often computationally expensive and can introduce significant modifications to initial cell images. In this article, we propose an easy-to-implement workflow where we trained a model to classify images from two datasets, and tested it on images coming from eight other datasets. An EfficientNet model was trained on a source dataset comprising images from two different datasets. It was afterwards fine-tuned on each of the eight target datasets using 100 or fewer annotated images from these datasets. Images from both the source and the target dataset underwent a color transform to put them into a standardized color style. The importance of the color transform and fine-tuning was evaluated through an ablation study and visually assessed with scatter plots, and an extensive error analysis was carried out. The model achieved an accuracy higher than 80% for every dataset and exceeded 90% for more than half of the datasets. The presented workflow yielded promising results in terms of generalizability, significantly improving performance on target datasets, while keeping computational cost low and maintaining consistent color transformations.
Source code is available at: <span><span>https://github.com/mc2295/WBC_Generalization</span></span></div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100405"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
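The workflow above hinges on putting source and target images into a standardized color style before fine-tuning. The abstract does not name the exact transform, so the following is only a minimal sketch of one common choice, Reinhard-style per-channel mean/variance matching, in NumPy:

```python
import numpy as np

def standardize_color(image: np.ndarray, ref_mean: np.ndarray, ref_std: np.ndarray) -> np.ndarray:
    """Match each channel's mean/std to a reference color style.

    image: H x W x 3 float array in [0, 1].
    ref_mean, ref_std: per-channel statistics of the reference style
    (here chosen arbitrarily; in practice estimated from the source dataset).
    """
    mean = image.reshape(-1, 3).mean(axis=0)
    std = image.reshape(-1, 3).std(axis=0) + 1e-8  # avoid division by zero
    out = (image - mean) / std * ref_std + ref_mean
    return np.clip(out, 0.0, 1.0)

# Toy example: pull a dark image toward a brighter reference style.
rng = np.random.default_rng(0)
img = rng.uniform(0.0, 0.4, size=(8, 8, 3))
ref_mean = np.array([0.7, 0.5, 0.6])
ref_std = np.array([0.1, 0.1, 0.1])
normed = standardize_color(img, ref_mean, ref_std)
```

After the transform, the image's per-channel statistics match the reference style, which is the property the cited ablation study isolates.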
{"title":"Pathology Informatics Summit 2024 Abstracts Ann Arbor Marriott at Eagle Crest Resort May 20-23, 2024 Ann Arbor, Michigan","authors":"","doi":"10.1016/j.jpi.2024.100392","DOIUrl":"10.1016/j.jpi.2024.100392","url":null,"abstract":"","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100392"},"PeriodicalIF":0.0,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142697926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging deep learning for identification and segmentation of \"CAF-1/p60-positive\" nuclei in oral squamous cell carcinoma tissue samples.","authors":"Silvia Varricchio, Gennaro Ilardi, Daniela Russo, Rosa Maria Di Crescenzo, Angela Crispino, Stefania Staibano, Francesco Merolla","doi":"10.1016/j.jpi.2024.100407","DOIUrl":"https://doi.org/10.1016/j.jpi.2024.100407","url":null,"abstract":"<p><p>In the current study, we introduced a unique method for identifying and segmenting oral squamous cell carcinoma (OSCC) nuclei, concentrating on those predicted to have significant CAF-1/p60 protein expression. Our suggested model uses the StarDist architecture, a deep-learning framework designed for biomedical image segmentation tasks. The training dataset comprises painstakingly annotated masks created from tissue sections previously stained with hematoxylin and eosin (H&E) and then restained with immunohistochemistry (IHC) for p60 protein. Our algorithm uses subtle morphological and colorimetric H&E cellular characteristics to predict CAF-1/p60 IHC expression in OSCC nuclei. The StarDist-based architecture performs exceptionally well in localizing and segmenting H&E nuclei, previously identified by IHC-based ground truth. In summary, our innovative approach harnesses deep learning and multimodal information to advance the automated analysis of OSCC nuclei exhibiting specific protein expression patterns. 
This methodology holds promise for expediting accurate pathological assessment and gaining deeper insights into the role of CAF-1/p60 protein within the context of oral cancer progression.</p>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"100407"},"PeriodicalIF":0.0,"publicationDate":"2024-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11653155/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142855784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
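StarDist, the architecture named in this abstract, represents each nucleus as a star-convex polygon: a center point plus radial distances sampled at equally spaced angles. A stdlib-only sketch of that representation (illustrative only, not the authors' code):

```python
import math

def star_polygon(center, distances):
    """Convert StarDist-style radial distances (sampled at K equally
    spaced angles around a nucleus center) into polygon vertices."""
    k = len(distances)
    verts = []
    for i, r in enumerate(distances):
        theta = 2.0 * math.pi * i / k
        verts.append((center[0] + r * math.cos(theta),
                      center[1] + r * math.sin(theta)))
    return verts

def polygon_area(verts):
    """Shoelace formula, useful e.g. to filter out tiny false detections."""
    area = 0.0
    n = len(verts)
    for i in range(n):
        x0, y0 = verts[i]
        x1, y1 = verts[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0

# A circle of radius 5 approximated with 32 rays: area approaches pi * 25.
verts = star_polygon((0.0, 0.0), [5.0] * 32)
area = polygon_area(verts)
```

This star-convex parameterization is what makes StarDist well suited to roughly convex objects such as nuclei.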
{"title":"Visual pathology reports for communication of final margin status in laryngeal cancer surgery","authors":"Marina Aweeda , Carly Fassler , Alexander N. Perez , Alexis Miller , Kavita Prasad , Kayvon F. Sharif , James S. Lewis Jr , Kim A. Ely , Mitra Mehrad , Sarah L. Rohde , Alexander J. Langerman , Kyle Mannion , Robert J. Sinard , James L. Netterville , Eben L. Rosenthal , Michael C. Topf","doi":"10.1016/j.jpi.2024.100404","DOIUrl":"10.1016/j.jpi.2024.100404","url":null,"abstract":"<div><h3>Background</h3><div>Positive margins are frequently observed in total laryngectomy (TL) specimens. Effective communication of margin sampling sites and final margin status between surgeons and pathologists is crucial. In this study, we evaluate the utility of multimedia visual pathology reports to facilitate interdisciplinary discussion of margin status in laryngeal cancer surgery.</div></div><div><h3>Methods</h3><div>Ex vivo laryngeal cancer surgical specimens were three-dimensional (3D) scanned before standard of care pathological analysis. Using computer-aided design software, the 3D model was annotated to reflect inking, sectioning, and margin sampling sites, generating a visual pathology report. These reports were distributed to head and neck surgeons and pathologists postoperatively.</div></div><div><h3>Results</h3><div>Fifteen laryngeal cancer surgical specimens were 3D scanned and virtually annotated from January 2022 to December 2023. Most specimens (73.3%) were squamous cell carcinomas (SCCs). Among the cases, 26.7% had final positive surgical margins, whereas 13.3% had close margins, defined as <5 mm. The visual pathology report demonstrated sites of close or positive margins on the 3D specimens and was used to facilitate postoperative communication between surgeons and pathologists in 85.7% of these cases. 
Visual pathology reports were presented in multidisciplinary tumor board discussions (20%), email correspondences (13.3%), and teleconferences (6.7%), and were referenced in the final written pathology reports (26.7%).</div></div><div><h3>Conclusions</h3><div>3D scanning and virtual annotation of laryngeal cancer specimens for the creation of visual pathology reports is an innovative approach for postoperative pathology documentation, margin analysis, and surgeon–pathologist communication.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100404"},"PeriodicalIF":0.0,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142697925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Presenting the framework of the whole slide image file Babel fish: An OCR-based file labeling tool","authors":"Nils Englert , Constantin Schwab , Maximilian Legnar , Cleo-Aron Weis","doi":"10.1016/j.jpi.2024.100402","DOIUrl":"10.1016/j.jpi.2024.100402","url":null,"abstract":"<div><h3>Introduction</h3><div>Metadata extraction from digitized slides or whole slide image files is a frequent, laborious, and tedious task. In this work, we present a tool to automatically extract all relevant slide information, such as case number, year, slide number, block number, and staining, from the macro-images of the scanned slide.</div><div>We named the tool Babel fish as it helps translate relevant information printed on the slide. It encodes certain basic assumptions, for example regarding the location of specific information on the slide; these can be adapted to the respective layout. The extracted metadata can then be used to sort digital slides into databases or to link them with associated case IDs from laboratory information systems.</div></div><div><h3>Material and methods</h3><div>The tool is based on optical character recognition (OCR). For most information, the easyOCR tool is used. For the block number and cases with insufficient results in the first OCR round, a second OCR with pytesseract is applied.</div><div>Two datasets are used: one for tool development with 342 slides, and another for testing with 110 slides.</div></div><div><h3>Results</h3><div>For the testing set, the overall accuracy for retrieving all relevant information per slide is 0.982. Of note, the accuracy for most information fields is 1.000, whereas the accuracy for the block number detection is 0.982.</div></div><div><h3>Conclusion</h3><div>The Babel fish tool can be used to rename vast amounts of whole slide image files in an image analysis pipeline.
Furthermore, it could be an essential part of DICOM conversion pipelines, as it extracts relevant metadata like case number, year, block ID, and staining.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100402"},"PeriodicalIF":0.0,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142697924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
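The extraction-and-renaming step this abstract describes can be sketched as a regex over the OCR output. The label layout and field pattern below are hypothetical; as the authors note, the assumptions about where and how information is printed have to be adapted to each lab's conventions:

```python
import re

# Hypothetical label layout: "H 2023/1234 B3 HE" -> year, case number,
# block number, staining. The real pattern must match each lab's printer.
LABEL_RE = re.compile(
    r"(?P<year>\d{4})/(?P<case>\d+)\s+B(?P<block>\d+)\s+(?P<stain>[A-Z]+)"
)

def parse_label(ocr_text: str):
    """Extract slide metadata from the OCR output of a macro-image."""
    m = LABEL_RE.search(ocr_text)
    if m is None:
        return None  # the real pipeline falls back to a second OCR pass here
    return m.groupdict()

def target_filename(meta) -> str:
    """Build a sortable file name from the extracted metadata."""
    return "{year}_{case}_block{block}_{stain}.svs".format(**meta)

meta = parse_label("H 2023/1234 B3 HE")
name = target_filename(meta)  # "2023_1234_block3_HE.svs"
```

Named capture groups keep the mapping from label fields to file-name parts explicit, which makes per-lab adaptation a matter of editing one pattern.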
{"title":"Multiple Instance Learning for WSI: A comparative analysis of attention-based approaches","authors":"Martim Afonso , Praphulla M.S. Bhawsar , Monjoy Saha , Jonas S. Almeida , Arlindo L. Oliveira","doi":"10.1016/j.jpi.2024.100403","DOIUrl":"10.1016/j.jpi.2024.100403","url":null,"abstract":"<div><div>Whole slide images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology. However, they represent a particular challenge to artificial intelligence (AI)-based/AI-mediated analysis because pathology labeling is typically done at the slide level, instead of the tile level. Not only are medical diagnoses recorded at the specimen level; oncogene mutation status is also experimentally determined, and recorded by initiatives like The Cancer Genome Atlas (TCGA), at the slide level. This configures a dual challenge: (a) accurately predicting the overall cancer phenotype and (b) finding out which cellular morphologies are associated with it at the tile level. To better understand and address these challenges, two existing weakly supervised Multiple Instance Learning (MIL) approaches were explored and compared: Attention MIL (AMIL) and Additive MIL (AdMIL). These architectures were analyzed on tumor detection (a task where these models previously obtained good results) and TP53 mutation detection (a much less explored task). For tumor detection, we built a dataset from Lung Squamous Cell Carcinoma (TCGA-LUSC) slides, with 349 positive and 349 negative slides. Patches were extracted at 5× magnification. For TP53 mutation detection, we explored a dataset built from Invasive Breast Carcinoma (TCGA-BRCA) slides, with 347 positive and 347 negative slides. In this case, we explored three different magnification levels: 5×, 10×, and 20×.
Our results show that a modified additive implementation of MIL matched the performance of the reference implementation (AUC 0.96), and was only slightly outperformed by AMIL (AUC 0.97) on the tumor detection task. TP53 mutation detection was most sensitive to features at the higher magnifications, where cellular morphology is resolved. More interestingly from the perspective of the molecular pathologist, we highlight the possible ability of these MIL architectures to identify distinct sensitivities to morphological features (through the detection of regions of interest, ROIs) at different magnification levels. This ability of the models to obtain tile-level ROIs is very appealing to pathologists, as it provides the possibility for these algorithms to be integrated into a digital staining application for analysis, facilitating navigation through these high-dimensional images and the diagnostic process.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100403"},"PeriodicalIF":0.0,"publicationDate":"2024-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
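The attention pooling shared by AMIL-style architectures can be sketched in a few lines: each tile embedding receives a scalar score, the scores are softmaxed into attention weights, and the slide embedding is the weighted sum. The weights themselves are what gets visualized as tile-level ROIs. A simplified, randomly initialized NumPy sketch (not the paper's implementation):

```python
import numpy as np

def attention_mil_pool(tile_feats: np.ndarray, w: np.ndarray, v: np.ndarray):
    """Attention-based MIL pooling (Ilse et al. style, simplified).

    tile_feats: N x D matrix of tile embeddings for one slide.
    w: D x H projection matrix, v: H-dim scoring vector (learned in practice).
    Returns the attention-weighted slide embedding and the weights.
    """
    scores = np.tanh(tile_feats @ w) @ v          # one scalar per tile
    scores = scores - scores.max()                # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()  # softmax over tiles
    slide_embedding = attn @ tile_feats           # weighted average, shape (D,)
    return slide_embedding, attn

rng = np.random.default_rng(42)
feats = rng.normal(size=(6, 16))   # 6 tiles, 16-dim embeddings
w = rng.normal(size=(16, 8))
v = rng.normal(size=(8,))
emb, attn = attention_mil_pool(feats, w, v)
```

Because the weights sum to one over tiles, ranking them directly yields the tile-level ROIs discussed in the abstract.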
{"title":"A multicenter study to evaluate the analytical precision by pathologists using the Aperio GT 450 DX","authors":"Thomas W. Bauer , Matthew G. Hanna , Kelly D. Smith , S. Joseph Sirintrapun , Meera R. Hameed , Deepti Reddi , Bernard S. Chang , Orly Ardon , Xiaozhi Zhou , Jenny V. Lewis , Shubham Dayal , Joseph Chiweshe , David Ferber , Aysegul Ergin Sutcu , Michael White","doi":"10.1016/j.jpi.2024.100401","DOIUrl":"10.1016/j.jpi.2024.100401","url":null,"abstract":"<div><h3>Background</h3><div>Digital pathology systems (DPS) are emerging as capable technologies for clinical practice. Studies have analyzed pathologists' diagnostic concordance by comparing reviews of whole slide images (WSIs) to glass slides (e.g., accuracy). This observational study evaluated the reproducibility of pathologists' diagnostic reviews using the Aperio GT 450 DX under slightly different conditions (precision).</div></div><div><h3>Method</h3><div>Diagnostic precision was tested in three conditions: intra-system (within systems), inter-system/site (between systems/sites), and intra- and inter-pathologist (within and between pathologists). A total of five study/reading pathologists (one pathologist each for intra-system, inter-system/site, and three for intra-pathologist/inter-pathologist analyses) were assigned to the respective sub-studies.</div><div>A panel of 69 glass slides with 23 unique histological features was used to evaluate the WSI system's precision. Each glass slide was scanned to generate a unique WSI. From each WSI, the field of view (FOV) was generated (at least 2 FOVs/WSI), which included the selected features (1–3 features/FOV). Each pathologist reviewed the digital slides and identified which morphological features, if any, were present in each defined FOV. To minimize recall bias, an additional 12 wild card slides from different organ types were used for which FOVs were extracted. 
The pathologists also read these wild-card slide FOVs; however, the corresponding feature identifications were not included in the final data analysis.</div></div><div><h3>Results</h3><div>Each measured endpoint met the pre-defined acceptance criteria of the lower bound of the 95% confidence interval (CI) overall agreement (OA) rate being ≥85% for each sub-study. The lower bound of the 95% CI for the intra-system OA rate was 95.8%; for inter-system analysis, it was 94.9%; for intra-pathologist analysis, 92.4%; whereas for inter-pathologist analyses, the lower bound of the 95% CI of the OA was 90.6%.</div></div><div><h3>Conclusion</h3><div>The study results indicate that pathologists using the Aperio GT 450 DX WSI system can precisely identify histological features that may be required for accurately diagnosing anatomic pathology cases.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100401"},"PeriodicalIF":0.0,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
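The acceptance criterion in this study turns on the lower bound of a 95% confidence interval for an agreement proportion. One standard way to compute such a bound is the Wilson score interval; the abstract does not state which interval method the study used, so this is only an illustrative sketch:

```python
import math

def wilson_lower_bound(successes: int, n: int, z: float = 1.959964) -> float:
    """Lower bound of the two-sided 95% Wilson score interval for a
    proportion (one common choice of interval method)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

# Hypothetical numbers: 980 agreements out of 1000 FOV reads.
lb = wilson_lower_bound(980, 1000)
```

With these hypothetical counts the lower bound is about 0.97, which would clear the study's ≥85% acceptance threshold; the Wilson interval is preferred over the normal approximation when the observed proportion is close to 1, as here.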
{"title":"AI drives the assessment of lung cancer microenvironment composition","authors":"Enzo Gallo , Davide Guardiani , Martina Betti , Brindusa Ana Maria Arteni , Simona Di Martino , Sara Baldinelli , Theodora Daralioti , Elisabetta Merenda , Andrea Ascione , Paolo Visca , Edoardo Pescarmona , Marialuisa Lavitrano , Paola Nisticò , Gennaro Ciliberto , Matteo Pallocca","doi":"10.1016/j.jpi.2024.100400","DOIUrl":"10.1016/j.jpi.2024.100400","url":null,"abstract":"<div><h3>Purpose</h3><div>The abundance and distribution of tumor-infiltrating lymphocytes (TILs) as well as that of other components of the tumor microenvironment is of particular importance for predicting response to immunotherapy in lung cancer (LC). We describe here a pilot study employing artificial intelligence (AI) in the assessment of TILs and other cell populations, intending to reduce the inter- or intra-observer variability that commonly characterizes this evaluation.</div></div><div><h3>Design</h3><div>We developed a machine learning-based classifier to detect tumor, immune, and stromal cells on hematoxylin and eosin-stained sections, using the open-source framework <em>QuPath</em>. We evaluated the quantity of the aforementioned three cell populations among 37 LC whole slide images regions of interest, comparing the assessments made by five pathologists, both before and after using graphical predictions made by AI, for a total of 1110 quantitative measurements.</div></div><div><h3>Results</h3><div>Our findings indicate noteworthy variations in score distribution among pathologists and between individual pathologists and AI. 
The AI-guided pathologists' evaluations reduced significant discrepancies across pathologists: three comparisons showed a loss of significance (<em>p</em> > 0.05), whereas the other four showed a reduction in significance (<em>p</em> > 0.01).</div></div><div><h3>Conclusions</h3><div>We show that employing a machine learning approach in cell population quantification reduces inter- and intra-observer variability, improving reproducibility and facilitating its use in further validation studies.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100400"},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11513621/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142523282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital mapping of resected cancer specimens: The visual pathology report","authors":"Carly Fassler , Marina Aweeda , Alexander N. Perez , Yuna Chung , Spencer Yueh , Robert J. Sinard , Sarah L. Rohde , Kyle Mannion , Alexander J. Langerman , Eben L. Rosenthal , Jie Ying Wu , Mitra Mehrad , Kim Ely , James S. Lewis Jr , Michael C. Topf","doi":"10.1016/j.jpi.2024.100399","DOIUrl":"10.1016/j.jpi.2024.100399","url":null,"abstract":"<div><h3>Background</h3><div>The current standard-of-care pathology report relies only on lengthy written text descriptions without a visual representation of the resected cancer specimen. This study demonstrates the feasibility of incorporating virtual, three-dimensional (3D) visual pathology reports to improve communication of final pathology reporting.</div></div><div><h3>Materials and methods</h3><div>Surgical specimens are 3D scanned and virtually mapped alongside the pathology team to replicate grossing. The 3D specimen maps are incorporated into a hybrid visual pathology report which displays the resected specimen and sampled margins alongside gross measurements, tumor characteristics, and microscopic diagnoses.</div></div><div><h3>Results</h3><div>Visual pathology reports were created for 10 head and neck cancer cases. 
Each report concisely communicated information from the final pathology report in a single page and contained significantly fewer words (293.4 words) than standard written pathology reports (850.1 words, <em>p</em> < 0.01).</div></div><div><h3>Conclusions</h3><div>We establish the feasibility of a novel visual pathology report that includes an annotated visual model of the resected cancer specimen in place of lengthy written text of standard of care head and neck cancer pathology reports.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100399"},"PeriodicalIF":0.0,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142432514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A precise machine learning model: Detecting cervical cancer using feature selection and explainable AI","authors":"Rashiduzzaman Shakil, Sadia Islam, Bonna Akter","doi":"10.1016/j.jpi.2024.100398","DOIUrl":"10.1016/j.jpi.2024.100398","url":null,"abstract":"<div><div>Cervical cancer remains a significant global health challenge. Due to inadequate early-stage screening and healthcare disparities, a large number of women suffer from this disease, and the mortality rate continues to rise. Hence, in this study, we present a precise approach utilizing six different machine learning models (decision tree, logistic regression, naïve Bayes, random forest, k-nearest neighbors, support vector machine) to predict early-stage cervical cancer by analysing 36 risk factor attributes of 858 individuals. In addition, two data balancing techniques, Synthetic Minority Oversampling Technique and Adaptive Synthetic Sampling, were used to mitigate data imbalance issues. Furthermore, two distinct feature selection processes, Chi-square and the Least Absolute Shrinkage and Selection Operator, were applied to rank the features most strongly correlated with the disease, and an explainable artificial intelligence technique, Shapley Additive Explanations, was integrated to clarify the model outcomes. Model performance was evaluated with standard metrics: accuracy, sensitivity, specificity, precision, F1-score, false-positive rate, false-negative rate, and area under the receiver operating characteristic curve. The decision tree performed best with Chi-square feature selection, achieving 97.60% accuracy, 98.73% sensitivity, 80% specificity, and 98.73% precision. On the imbalanced data, the decision tree achieved 97% accuracy, 99.35% sensitivity, 69.23% specificity, and 97.45% precision.
This research is focused on developing diagnostic frameworks with automated tools to improve the detection and management of cervical cancer, as well as on helping healthcare professionals deliver more efficient and personalized care to their patients.</div></div>","PeriodicalId":37769,"journal":{"name":"Journal of Pathology Informatics","volume":"15 ","pages":"Article 100398"},"PeriodicalIF":0.0,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
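The Chi-square feature ranking used in this pipeline can be illustrated from first principles for binary features: build the 2x2 contingency table against the label and sum the squared deviations from independence. A toy, stdlib-only sketch (not the authors' code, which operated on 36 risk-factor attributes):

```python
def chi2_score(feature, label):
    """Chi-square statistic for a binary feature vs. a binary label,
    computed from the 2x2 contingency table (no Yates correction)."""
    n = len(feature)
    table = [[0, 0], [0, 0]]
    for f, y in zip(feature, label):
        table[f][y] += 1
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

# Toy data: feature A tracks the label perfectly, feature B is pure noise.
label  = [1, 1, 1, 1, 0, 0, 0, 0]
feat_a = [1, 1, 1, 1, 0, 0, 0, 0]
feat_b = [1, 0, 1, 0, 1, 0, 1, 0]
feats = {"A": feat_a, "B": feat_b}
ranked = sorted(feats, key=lambda f: chi2_score(feats[f], label), reverse=True)
```

A perfectly predictive feature scores the maximum (here n = 8) and an independent one scores 0, which is exactly the ordering a Chi-square selection step exploits.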