Ethar Alzaid, Gabriele Pergola, Harriet Evans, David Snead, Fayyaz Minhas
{"title":"Large multimodal model-based standardisation of pathology reports with confidence and its prognostic significance","authors":"Ethar Alzaid, Gabriele Pergola, Harriet Evans, David Snead, Fayyaz Minhas","doi":"10.1002/2056-4538.70010","DOIUrl":"10.1002/2056-4538.70010","url":null,"abstract":"<p>Despite the existence of established standards and guidelines for pathology reporting, many pathology reports are still written in unstructured free text. Extracting information from these reports and formatting it according to a standard is crucial for consistent interpretation. Automated information extraction from unstructured pathology reports is a challenging task, as it requires accurately interpreting medical terminologies and context-dependent details. In this work, we present a practical approach for automatically extracting information from unstructured pathology reports or scanned paper reports utilising a large multimodal model. This framework uses context-aware prompting strategies to extract values of individual fields, such as grade, size, etc. from pathology reports. A unique feature of the proposed approach is that it assigns a confidence value indicating the correctness of the model's extraction for each field and generates a structured report in line with national pathology guidelines in human and machine-readable formats. We have analysed the extraction performance in terms of accuracy and kappa scores, and the quality of the confidence scores assigned by the model. We have also evaluated the prognostic value of the extracted fields and feature embeddings of the raw text. Results showed that the model can accurately extract information with an accuracy and kappa score up to 0.99 and 0.98, respectively. 
Our results indicate that confidence scores are an effective indicator of the correctness of the extracted information, achieving an area under the receiver operating characteristic curve of up to 0.93, thus enabling automatic flagging of extraction errors. Our analysis further reveals that, as expected, information extracted from pathology reports is highly prognostically relevant. The framework demo is available at: https://labieb.dcs.warwick.ac.uk/. Information extracted from pathology reports of colorectal cancer cases in The Cancer Genome Atlas using the proposed approach and its code are available at: https://github.com/EtharZaid/Labieb.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/2056-4538.70010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142639939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High chromosomal instability is associated with higher 10-year risks of recurrence for hormone receptor-positive, human epidermal growth factor receptor 2-negative breast cancer patients: clinical evidence from a large-scale, multiple-site, retrospective study","authors":"Yu-Yang Liao, Jianfei Fu, Xiang Lu, Ziliang Qian, Yang Yu, Liang Zhu, Jia-Ni Pan, Pu-Chun Li, Qiao-Yan Zhu, Xiaolin Li, Wenyong Sun, Xiao-Jia Wang, Wen-Ming Cao","doi":"10.1002/2056-4538.70011","DOIUrl":"10.1002/2056-4538.70011","url":null,"abstract":"<p>Long-term survival varies among hormone receptor-positive (HR+) and human epidermal growth factor receptor 2-negative (HER2−) breast cancer patients and is seriously impaired by metastasis. Chromosomal instability (CIN) is one of the key drivers of breast cancer metastasis. Here we evaluate CIN and 10-year invasive disease-free survival (iDFS) and overall survival (OS) in HR+/HER2− breast cancer. In this large-scale, multiple-site, retrospective study, 354 HR+/HER2− breast cancer patients were recruited. Of these, 204 patients were used for internal training, 70 for external validation, and 80 for cross-validation. All medical records were carefully reviewed to obtain the disease recurrence information. Formalin-fixed paraffin-embedded tissue samples were collected, followed by low-pass whole-genome sequencing with a median genome coverage of 1.86X using a minimum of 1 ng DNA input. CIN was then assessed using a customized bioinformatics workflow. Three or more instances of CIN per sample were defined as high CIN, and the frequency was 42.2% (86/204) in the internal cohort. High CIN correlated significantly with increased lymph node metastasis, vascular invasion, progesterone receptor negative status, HER2 low, and worse pathological type, and was an independent prognostic factor for HR+/HER2− breast cancer. 
Patients with high CIN had shorter iDFS and OS than those with low CIN [10-year iDFS 11.1% versus 82.2%, hazard ratio (HR) = 11.12, <i>p</i> < 0.01; 10-year OS 45.7% versus 94.3%, HR = 14.17, <i>p</i> < 0.01]. These findings were validated in two external cohorts with 70 breast cancer patients. Moreover, high CIN could predict the prognosis more accurately than Adjuvant! Online score (10-year iDFS 11.1% versus 48.6%, HR = 2.71, <i>p</i> < 0.01). Cross-validation analysis showed high consistency (83.8%) between CIN and MammaPrint score, but only 45% consistency between CIN and Adjuvant! Online score. In conclusion, high CIN is an independent prognostic indicator for HR+/HER2− breast cancer with shorter iDFS and OS and holds promise for predicting recurrence and metastasis.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/2056-4538.70011","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142639935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Merijn CF Mulders, Anna Vera D Verschuur, Quido G de Lussanet de la Sablonière, Eva Maria Roes, Christoph Geisenberger, Lodewijk AA Brosens, Wouter W de Herder, Marie-Louise F van Velthuysen, Johannes Hofland
{"title":"Clinicopathological and epigenetic differences between primary neuroendocrine tumors and neuroendocrine metastases in the ovary","authors":"Merijn CF Mulders, Anna Vera D Verschuur, Quido G de Lussanet de la Sablonière, Eva Maria Roes, Christoph Geisenberger, Lodewijk AA Brosens, Wouter W de Herder, Marie-Louise F van Velthuysen, Johannes Hofland","doi":"10.1002/2056-4538.70000","DOIUrl":"10.1002/2056-4538.70000","url":null,"abstract":"<p>Currently, the available literature provides insufficient support to differentiate between primary ovarian neuroendocrine tumors (PON) and neuroendocrine ovarian metastases (NOM) in patients. For this reason, patients with a well-differentiated ovarian neuroendocrine tumor (NET) were identified through electronic patient records and a nationwide search between 1991 and 2023. Clinical characteristics were collected from electronic patient files. This resulted in the inclusion of 71 patients with NOM and 17 patients with PON. Histologic material was stained for Ki67, SSTR2a, CDX2, PAX8, TTF1, SATB2, ISLET1, OTP, PDX1, and ARX. DNA methylation analysis was performed on a subset of cases. All PON were unilateral and nine were found within a teratoma (PON-T+). A total of 78% of NOM were bilateral, and none were associated with a teratoma. PON without teratomous components (PON-T−) displayed a similar insular growth pattern and immunohistochemistry as NOM (<i>p</i> > 0.05). When compared with PON-T+, PON-T− more frequently displayed ISLET1 positivity and were larger, and patients were older at diagnosis (<i>p</i> < 0.05). Unsupervised analysis of DNA methylation profiles from tumors of ovarian (<i>n</i> = 16), pancreatic (<i>n</i> = 22), ileal (<i>n</i> = 10), and rectal (<i>n</i> = 7) origin revealed that four of five PON-T− clustered together with NOM and ileal NET, whereas four of five PON-T+ grouped with rectum NET. In conclusion, unilateral ovarian NET within a teratoma should be treated as a PON. 
Ovarian NET localizations without teratomous components have a molecular profile analogous to midgut NET metastases. For these patients, a thorough review of imaging should be performed to identify a possible undetected midgut NET and a corresponding follow-up strategy may be recommended.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11544441/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142606935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katherine J Hewitt, Isabella C Wiest, Zunamys I Carrero, Laura Bejan, Thomas O Millner, Sebastian Brandner, Jakob Nikolas Kather
{"title":"Large language models as a diagnostic support tool in neuropathology","authors":"Katherine J Hewitt, Isabella C Wiest, Zunamys I Carrero, Laura Bejan, Thomas O Millner, Sebastian Brandner, Jakob Nikolas Kather","doi":"10.1002/2056-4538.70009","DOIUrl":"10.1002/2056-4538.70009","url":null,"abstract":"<p>The WHO guidelines for classifying central nervous system (CNS) tumours are changing considerably with each release. The classification of CNS tumours is uniquely complex among most other solid tumours as it incorporates not just morphology, but also genetic and epigenetic features. Keeping current with these changes across medical fields can be challenging, even for clinical specialists. Large language models (LLMs) have demonstrated their ability to parse and process complex medical text, but their utility in neuro-oncology has not been systematically tested. We hypothesised that LLMs can effectively diagnose neuro-oncology cases from free-text histopathology reports according to the latest WHO guidelines. To test this hypothesis, we evaluated the performance of ChatGPT-4o, Claude-3.5-sonnet, and Llama3 across 30 challenging neuropathology cases, which each presented a complex mix of morphological and genetic information relevant to the diagnosis. Furthermore, we integrated these models with the latest WHO guidelines through Retrieval-Augmented Generation (RAG) and again assessed their diagnostic accuracy. Our data show that LLMs equipped with RAG, but not without RAG, can accurately diagnose the neuropathological tumour subtype in 90% of the tested cases. 
This study lays the groundwork for a new generation of computational tools that can assist neuropathologists in their daily reporting practice.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540532/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142590746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Homologous recombination deficiency score is an independent prognostic factor in esophageal squamous cell carcinoma","authors":"Yulu Wang, Bowen Ding, Yunlan Tao, Lingli Huang, Qian Zhu, Chengying Gao, Mingli Feng, Yuchen Han","doi":"10.1002/2056-4538.70007","DOIUrl":"10.1002/2056-4538.70007","url":null,"abstract":"<p>Homologous recombination deficiency (HRD) represents an impairment in the homologous recombination repair (HRR) pathway, crucial for repairing DNA double-strand breaks and contributing to genomic instability in cancer. The HRD score may be a more reliable biomarker than HRR-related gene mutations for identifying patients sensitive to poly(ADP-ribose) polymerase inhibitors. Despite its relevance in various cancers, the HRD score remains underexplored in esophageal squamous cell carcinoma (ESCC). We retrospectively analyzed HRD scores in 96 ESCC patients, examining correlations with clinical characteristics and survival outcomes, and validated our findings using the TCGA dataset. Genomic sequencing utilized a custom superHRD next-generation sequencing panel, and HRD scores were calculated from 54,000 single-nucleotide polymorphisms using Kruskal–Wallis rank-sum tests and two cut-off points for analysis. Higher HRD scores correlated with advanced tumor stages, recurrence, and mutations in <i>TP53</i> and <i>ABCB1</i>, while <i>APC</i> mutations were linked to lower HRD scores. Patients with high HRD scores had significantly shorter disease-free survival (<i>p</i> = 0.013) and a trend toward shorter overall survival (OS) (<i>p</i> = 0.005), particularly those not receiving adjuvant therapy. Conversely, HRD-high patients undergoing adjuvant therapy showed a trend toward longer OS (<i>p</i> = 0.015). Multivariate analysis identified HRD as an independent prognostic factor (hazard ratio = 2.814 for recurrence, <i>p</i> = 0.015). Validation with the TCGA dataset supported these findings. 
This study highlights the associations between HRD scores, clinical characteristics, and genomic mutations in ESCC, suggesting HRD as a potential prognostic biomarker. HRD assessment may aid in patient stratification and personalized treatment strategies, warranting further investigation to validate the therapeutic implications of HRD scores in ESCC.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/2056-4538.70007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142523410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nikolas Stathonikos, Marc Aubreville, Sjoerd de Vries, Frauke Wilm, Christof A Bertram, Mitko Veta, Paul J van Diest
{"title":"Breast cancer survival prediction using an automated mitosis detection pipeline","authors":"Nikolas Stathonikos, Marc Aubreville, Sjoerd de Vries, Frauke Wilm, Christof A Bertram, Mitko Veta, Paul J van Diest","doi":"10.1002/2056-4538.70008","DOIUrl":"10.1002/2056-4538.70008","url":null,"abstract":"<p>Mitotic count (MC) is the most common measure to assess tumor proliferation in breast cancer patients and is highly predictive of patient outcomes. It is, however, subject to inter- and intraobserver variation and reproducibility challenges that may hamper its clinical utility. In past studies, artificial intelligence (AI)-supported MC has been shown to correlate well with traditional MC on glass slides. Considering the potential of AI to improve reproducibility of MC between pathologists, we undertook the next validation step by evaluating the prognostic value of a fully automatic method to detect and count mitoses on whole slide images using a deep learning model. The model was developed in the context of the Mitosis Domain Generalization Challenge 2021 (MIDOG21) grand challenge and was expanded by a novel automatic area selector method to find the optimal mitotic hotspot and calculate the MC per 2 mm<sup>2</sup>. We employed this method on a breast cancer cohort with long-term follow-up from the University Medical Centre Utrecht (<i>N</i> = 912) and compared predictive values for overall survival of AI-based MC and light-microscopic MC, previously assessed during routine diagnostics. The MIDOG21 model was prognostically comparable to the original MC from the pathology report in uni- and multivariate survival analysis. 
In conclusion, a fully automated MC AI algorithm was validated in a large cohort of breast cancer with regard to retained prognostic value compared with traditional light-microscopic MC.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/2056-4538.70008","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142510889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joep MA Bogaerts, Miranda P Steenbeek, John-Melle Bokhorst, Majke HD van Bommel, Luca Abete, Francesca Addante, Mariel Brinkhuis, Alicja Chrzan, Fleur Cordier, Mojgan Devouassoux-Shisheboran, Juan Fernández-Pérez, Anna Fischer, C Blake Gilks, Angela Guerriero, Marta Jaconi, Tony G Kleijn, Loes Kooreman, Spencer Martin, Jakob Milla, Nadine Narducci, Chara Ntala, Vinita Parkash, Christophe de Pauw, Joseph T Rabban, Lucia Rijstenberg, Robert Rottscholl, Annette Staebler, Koen Van de Vijver, Gian Franco Zannoni, Monica van Zanten, AI-STIC Study Group, Joanne A de Hullu, Michiel Simons, Jeroen AWM van der Laak
{"title":"Assessing the impact of deep-learning assistance on the histopathological diagnosis of serous tubal intraepithelial carcinoma (STIC) in fallopian tubes","authors":"Joep MA Bogaerts, Miranda P Steenbeek, John-Melle Bokhorst, Majke HD van Bommel, Luca Abete, Francesca Addante, Mariel Brinkhuis, Alicja Chrzan, Fleur Cordier, Mojgan Devouassoux-Shisheboran, Juan Fernández-Pérez, Anna Fischer, C Blake Gilks, Angela Guerriero, Marta Jaconi, Tony G Kleijn, Loes Kooreman, Spencer Martin, Jakob Milla, Nadine Narducci, Chara Ntala, Vinita Parkash, Christophe de Pauw, Joseph T Rabban, Lucia Rijstenberg, Robert Rottscholl, Annette Staebler, Koen Van de Vijver, Gian Franco Zannoni, Monica van Zanten, AI-STIC Study Group, Joanne A de Hullu, Michiel Simons, Jeroen AWM van der Laak","doi":"10.1002/2056-4538.70006","DOIUrl":"10.1002/2056-4538.70006","url":null,"abstract":"<p>In recent years, it has become clear that artificial intelligence (AI) models can achieve high accuracy in specific pathology-related tasks. An example is our deep-learning model, designed to automatically detect serous tubal intraepithelial carcinoma (STIC), the precursor lesion to high-grade serous ovarian carcinoma, found in the fallopian tube. However, the standalone performance of a model is insufficient to determine its value in the diagnostic setting. To evaluate the impact of the use of this model on pathologists' performance, we set up a fully crossed multireader, multicase study, in which 26 participants, from 11 countries, reviewed 100 digitalized H&E-stained slides of fallopian tubes (30 cases/70 controls) with and without AI assistance, with a washout period between the sessions. We evaluated the effect of the deep-learning model on accuracy, slide review time and (subjectively perceived) diagnostic certainty, using mixed-models analysis. With AI assistance, we found a significant increase in accuracy (<i>p</i> < 0.01) whereby the average sensitivity increased from 82% to 93%. 
Further, there was a significant 44 s (32%) reduction in slide review time (<i>p</i> < 0.01). The level of certainty that the participants felt versus their own assessment also significantly increased, by 0.24 on a 10-point scale (<i>p</i> < 0.01). In conclusion, we found that, in a diverse group of pathologists and pathology residents, AI support resulted in a significant improvement in the accuracy of STIC diagnosis and was coupled with a substantial reduction in slide review time. This model has the potential to provide meaningful support to pathologists in the diagnosis of STIC, ultimately streamlining and optimizing the overall diagnostic process.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11496567/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142510888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A clinically feasible algorithm for the parallel detection of glioma-associated copy number variation markers based on shallow whole genome sequencing","authors":"Shuai Wu, Chenyu Ma, Jiawei Cai, Chenkang Yang, Xiaojia Liu, Chen Luo, Jingyi Yang, Zhang Xiong, Dandan Cao, Hong Chen","doi":"10.1002/2056-4538.70005","DOIUrl":"10.1002/2056-4538.70005","url":null,"abstract":"<p>Molecular features are incorporated into the integrated diagnostic system for adult diffuse gliomas. Of these, copy number variation (CNV) markers, including both arm-level (1p/19q codeletion, +7/−10 signature) and gene-level (<i>EGFR</i> gene amplification, <i>CDKN2A/B</i> homozygous deletion) changes, have revolutionized the diagnostic paradigm by updating the subtyping and grading schemes. Shallow whole genome sequencing (sWGS) has been widely used for CNV detection due to its cost-effectiveness and versatility. However, the parallel detection of glioma-associated CNV markers using sWGS has not been optimized in a clinical setting. Herein, we established a model-based approach to classify the CNV status of glioma-associated diagnostic markers with a single test. To enhance its clinical utility, we carried out hypothesis testing model-based analysis through the estimation of copy ratio fluctuation level, which was implemented individually and independently and, thus, avoided the necessity for normal controls. Besides, the customization of required minimal tumor fraction (TF) was evaluated and recommended for each glioma-associated marker to ensure robust classification. As a result, with 1× sequencing depth and 0.05 TF, arm-level CNVs could be reliably detected with at least 99.5% sensitivity and specificity. For <i>EGFR</i> gene amplification and <i>CDKN2A/B</i> homozygous deletion, the corresponding TF limits were 0.15 and 0.45 to ensure the evaluation metrics were both higher than 97%. 
Furthermore, we applied the algorithm to an independent glioma cohort and observed the expected sample distribution and prognostic stratification patterns. In conclusion, we provide a clinically applicable algorithm to classify the CNV status of glioma-associated markers in parallel.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458885/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142394351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jun Hyeong Park, June Hyuck Lim, Seonhwa Kim, Chul-Ho Kim, Jeong-Seok Choi, Jun Hyeok Lim, Lucia Kim, Jae Won Chang, Dongil Park, Myung-won Lee, Sup Kim, Il-Seok Park, Seung Hoon Han, Eun Shin, Jin Roh, Jaesung Heo
{"title":"Deep learning-based analysis of EGFR mutation prevalence in lung adenocarcinoma H&E whole slide images","authors":"Jun Hyeong Park, June Hyuck Lim, Seonhwa Kim, Chul-Ho Kim, Jeong-Seok Choi, Jun Hyeok Lim, Lucia Kim, Jae Won Chang, Dongil Park, Myung-won Lee, Sup Kim, Il-Seok Park, Seung Hoon Han, Eun Shin, Jin Roh, Jaesung Heo","doi":"10.1002/2056-4538.70004","DOIUrl":"10.1002/2056-4538.70004","url":null,"abstract":"<p><i>EGFR</i> mutations are a major prognostic factor in lung adenocarcinoma. However, current detection methods require sufficient samples and are costly. Deep learning is promising for mutation prediction in histopathological image analysis but has limitations in that it does not sufficiently reflect tumor heterogeneity and lacks interpretability. In this study, we developed a deep learning model to predict the presence of <i>EGFR</i> mutations by analyzing histopathological patterns in whole slide images (WSIs). We also introduced the <i>EGFR</i> mutation prevalence (EMP) score, which quantifies <i>EGFR</i> prevalence in WSIs based on patch-level predictions, and evaluated its interpretability and utility. Our model estimates the probability of EGFR prevalence in each patch by partitioning the WSI based on multiple-instance learning and predicts the presence of <i>EGFR</i> mutations at the slide level. We utilized a patch-masking scheduler training strategy to enable the model to learn various histopathological patterns of EGFR. This study included 868 WSI samples from lung adenocarcinoma patients collected from three medical institutions: Hallym University Medical Center, Inha University Hospital, and Chungnam National University Hospital. For the test dataset, 197 WSIs were collected from Ajou University Medical Center to evaluate the presence of <i>EGFR</i> mutations. 
Our model demonstrated prediction performance with an area under the receiver operating characteristic curve of 0.7680 (0.7607–0.7720) and an area under the precision-recall curve of 0.8391 (0.8326–0.8430). The EMP score showed Spearman correlation coefficients of 0.4705 (<i>p</i> = 0.0087) for p.L858R and 0.5918 (<i>p</i> = 0.0037) for exon 19 deletions in 64 samples subjected to next-generation sequencing analysis. Additionally, high EMP scores were associated with papillary and acinar patterns (<i>p</i> = 0.0038 and <i>p</i> = 0.0255, respectively), whereas low EMP scores were associated with solid patterns (<i>p</i> = 0.0001). These results validate the reliability of our model and suggest that it can provide crucial information for rapid screening and treatment plans.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11446692/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring prognostic biomarkers in pathological images of colorectal cancer patients via deep learning","authors":"Binshen Wei, Linqing Li, Yenan Feng, Sihan Liu, Peng Fu, Lin Tian","doi":"10.1002/2056-4538.70003","DOIUrl":"10.1002/2056-4538.70003","url":null,"abstract":"<p>Hematoxylin and eosin (H&E) whole slide images provide valuable information for predicting prognostic outcomes in colorectal cancer (CRC) patients. However, extracting prognostic indicators from pathological images is challenging due to the subtle complexities of phenotypic information. We trained a weakly supervised deep learning model on data from 640 CRC patients in the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial dataset and validated it using data from 522 CRC patients in The Cancer Genome Atlas (TCGA) dataset. We created the colorectal cancer risk score (CRCRS) to assess patient prognosis, visualized the pathological phenotype of the risk score using Grad-CAM, and employed multiomics data from the TCGA CRC cohort to investigate the potential biological mechanisms underlying the risk score. The overall survival analysis revealed that the CRCRS served as an independent prognostic indicator for both the PLCO cohort (<i>p</i> < 0.001) and the TCGA cohort (<i>p</i> < 0.001), with its predictive efficacy remaining unaffected by the clinical staging system. Additionally, satisfactory chemotherapeutic benefits were observed in stage II/III CRC patients with high CRCRS but not in those with low CRCRS. A pathomics nomogram constructed by integrating the CRCRS with the tumor-node-metastasis (TNM) staging system enhanced prognostic prediction accuracy compared with using the TNM staging system alone. Noteworthy features of the risk score were identified, such as immature tumor mesenchyme, disorganized gland structures, small clusters of cancer cells associated with unfavorable prognosis, and infiltrating inflammatory cells associated with favorable prognosis. 
The TCGA multiomics data revealed potential correlations between the CRCRS and the activation of energy production and metabolic pathways, the tumor immune microenvironment, and genetic mutations in <i>APC</i>, <i>SMAD2</i>, <i>EEF1AKMT4</i>, <i>EPG5</i>, and <i>TANC1</i>. In summary, our deep learning algorithm identified the CRCRS as a prognostic indicator in CRC, providing a significant approach for prognostic risk stratification and tailoring precise treatment strategies for individual patients.</p>","PeriodicalId":48612,"journal":{"name":"Journal of Pathology Clinical Research","volume":"10 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/2056-4538.70003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142356370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}