EGFR Mutation Detection in Whole Slide Images of Non-Small Cell Lung Cancers Using a Two-Stage Deep Transfer Learning Approach
Michele Zanoletti, Filippo Ugolini, Laila El Bachiri, Valeria Pasini, Marco Laurino, Francesco De Logu, Eleonora Melissa, Carolina Marchi, Maria Colombino, Daniela Massi, Guido Rindi, Camilla Eva Comin, Giuseppe Palmieri, Antonio Cossu
Cancer Medicine 14(18), published 2025-09-18. DOI: 10.1002/cam4.71249 (https://onlinelibrary.wiley.com/doi/10.1002/cam4.71249)
Abstract
Background
Lung cancer (LC) is the leading cause of cancer death worldwide. Non-small cell lung cancer is the most frequent type and includes adenocarcinoma and squamous cell carcinoma. LC treatment is currently based on tumor molecular profiling. Some tumors carry mutations in the Epidermal Growth Factor Receptor (EGFR) gene, and detecting these mutations is crucial for tyrosine kinase inhibitor therapy.
Methods
This study used a computational methodology with two Convolutional Neural Networks (CNNs) based on InceptionResNet-V2, applied to Whole Slide Images: the first distinguishes healthy from cancerous tissue, and the second then identifies EGFR-mutated tumor tissue samples. We also integrated an Explainable AI technique (Grad-CAM) to visualize the model's decision-making process. The analysis was conducted on 259 LC cases collected from three centers (Florence, Rome, and Sassari).
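The two-stage flow described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function `classify_slide`, the 0.5 thresholds, the mean-based aggregation rule, and the toy stand-ins for the two CNNs are all assumptions introduced here, since the abstract does not state how tiles are filtered or how tile scores are aggregated to the slide level.

```python
from statistics import mean
from typing import Callable, Optional, Sequence

Tile = Sequence[float]  # stand-in for an image tile (hypothetical representation)


def classify_slide(
    tiles: Sequence[Tile],
    tumor_prob: Callable[[Tile], float],
    egfr_prob: Callable[[Tile], float],
    tumor_threshold: float = 0.5,  # assumed cut-off, not from the paper
    egfr_threshold: float = 0.5,   # assumed cut-off, not from the paper
) -> dict:
    """Two-stage cascade: keep tumor tiles, then score them for EGFR mutation."""
    # Stage 1: keep only tiles the first model considers cancerous.
    tumor_tiles = [t for t in tiles if tumor_prob(t) >= tumor_threshold]
    if not tumor_tiles:
        # No tumor tissue found, so no EGFR call can be made for this slide.
        return {"n_tumor_tiles": 0, "egfr_score": None, "egfr_mutated": None}

    # Stage 2: score each tumor tile, then aggregate to one slide-level score.
    # Averaging is an assumed aggregation rule chosen for simplicity.
    score: Optional[float] = mean(egfr_prob(t) for t in tumor_tiles)
    return {
        "n_tumor_tiles": len(tumor_tiles),
        "egfr_score": score,
        "egfr_mutated": score >= egfr_threshold,
    }


# Toy stand-ins for the two CNNs: each simply reads one number from the tile.
def toy_tumor(tile: Tile) -> float:
    return tile[0]


def toy_egfr(tile: Tile) -> float:
    return tile[1]


slide = [(0.9, 0.8), (0.2, 0.9), (0.7, 0.4), (0.95, 0.6)]
result = classify_slide(slide, toy_tumor, toy_egfr)
print(result)
```

In a real pipeline the two callables would wrap trained networks, and the second tile in the toy slide shows why the cascade matters: its high EGFR score is ignored because stage 1 did not flag it as tumor.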
Results
In distinguishing healthy from cancerous tissue, this methodology achieved an accuracy of 96.17%, a specificity of 87.89%, a sensitivity of 98.43%, an F1 score of 97.59%, an AUC of 0.99, and a Cohen's kappa of 0.7982; the confusion matrix showed a correct classification rate of 96.2%. For EGFR mutation detection in cancer tissue, slide-level performance after aggregation reached an accuracy of 76.67%, a specificity of 80.77%, a sensitivity of 73.53%, an F1 score of 78.12%, an AUC of 0.77, and a Cohen's kappa of 0.5583; the confusion matrix showed a correct classification rate of 76.7%.
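For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, specificity, F1, and Cohen's kappa all follow from a single binary confusion matrix. The helper `binary_metrics` and the counts passed to it are illustrative assumptions, not the study's data; AUC is omitted because it requires the underlying scores rather than the thresholded counts.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive common binary classification metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n        # the "correct classification rate"
    sensitivity = tp / (tp + fn)    # recall on the positive class
    specificity = tn / (tn + fp)    # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e,
    # where p_e comes from the marginal rates of true and predicted labels.
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((tn + fp) / n) * ((tn + fn) / n)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
m = binary_metrics(tp=40, fp=10, tn=30, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Because kappa discounts agreement expected by chance, it can sit well below accuracy when one class dominates, which is consistent with the gap between the accuracy and kappa values reported above.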
Conclusion
The two tested CNNs showed potential for assisting LC diagnosis, especially in distinguishing healthy from tumor tissue. While the direct detection of EGFR mutational status remains challenging, the results suggest that relevant predictive signals can still be extracted from routine hematoxylin and eosin (H&E) slides.
About the journal:
Cancer Medicine is a peer-reviewed, open access, interdisciplinary journal providing rapid publication of research from global biomedical researchers across the cancer sciences. The journal will consider submissions from all oncologic specialties, including, but not limited to, the following areas:
Clinical Cancer Research:
Translational research ∙ clinical trials ∙ chemotherapy ∙ radiation therapy ∙ surgical therapy ∙ clinical observations ∙ clinical guidelines ∙ genetic consultation ∙ ethical considerations
Cancer Biology:
Molecular biology ∙ cellular biology ∙ molecular genetics ∙ genomics ∙ immunology ∙ epigenetics ∙ metabolic studies ∙ proteomics ∙ cytopathology ∙ carcinogenesis ∙ drug discovery and delivery.
Cancer Prevention:
Behavioral science ∙ psychosocial studies ∙ screening ∙ nutrition ∙ epidemiology and prevention ∙ community outreach.
Bioinformatics:
Gene expression profiles ∙ gene regulation networks ∙ genome bioinformatics ∙ pathway analysis ∙ prognostic biomarkers.
Cancer Medicine publishes original research articles, systematic reviews, meta-analyses, and research methods papers, along with invited editorials and commentaries. Original research papers must report well-conducted research with conclusions supported by the data presented in the paper.