Deep learning for fine-grained molecular-based colorectal cancer classification
Authors: Junyu Bian, Yansong Li, Yamei Dang, Yonglin Chen
Journal: Translational Cancer Research, 14(5), pages 3035-3046 (Journal Article; JCR Q4, Oncology; impact factor 1.5)
Published: 2025-05-30 (Epub 2025-05-08)
DOI: 10.21037/tcr-2024-2348 (https://doi.org/10.21037/tcr-2024-2348)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12169992/pdf/
Citation count: 0
Abstract
Background: Colorectal cancer (CRC) is one of the most common malignancies globally and a major cause of cancer-related deaths. In the molecular diagnosis of CRC, microsatellite instability (MSI) status and mutations in genes such as BRAF, KRAS, and NRAS are important molecular markers. Traditional molecular detection methods are costly and time-consuming. Therefore, this study proposes a fine-grained classification method for CRC based on hematoxylin and eosin (H&E) stained tissue section images combined with deep learning (DL) technology, aiming to provide new insights into the molecular diagnosis of CRC.
Methods: We first collected H&E-stained tissue section images of 383 CRC patients from The First Hospital of Lanzhou University (LZUFH) and constructed the LZUFH_CRC dataset. We then proposed a hybrid DL model combining a convolutional neural network (CNN) and a Vision Transformer (ViT) for fine-grained classification tasks in CRC. The model consists of three parts: a feature extractor, an aggregator, and a classification head. A two-stage training strategy was adopted for model training. Finally, we evaluated the performance of the model on the LZUFH_CRC dataset and compared it with other methods.
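The three-part layout described in the Methods (feature extractor, aggregator, classification head) can be sketched as a data flow. This is only an illustrative toy under stated assumptions, not the authors' model: the feature extractor is a stand-in that computes two pixel statistics instead of running a CNN, the aggregator is a simple attention-weighted pooling instead of a ViT, and all function names, the `query` vector, the weights, and the four-class output are hypothetical. The two-stage training strategy is not shown.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def extract_features(patch: Sequence[float]) -> Vector:
    """Stand-in for a CNN feature extractor (assumption): mean and std of pixel values."""
    mean = sum(patch) / len(patch)
    variance = sum((p - mean) ** 2 for p in patch) / len(patch)
    return [mean, math.sqrt(variance)]


def aggregate(features: Sequence[Vector], query: Vector) -> Vector:
    """Stand-in for the aggregator (assumption): attention-weighted pooling of patch features."""
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def classify(slide_vector: Vector, weight_rows: Sequence[Vector], biases: Vector) -> Vector:
    """Classification head: one linear layer followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, slide_vector)) + b
        for row, b in zip(weight_rows, biases)
    ]
    return softmax(logits)


def predict_slide(
    patches: Sequence[Sequence[float]],
    query: Vector,
    weight_rows: Sequence[Vector],
    biases: Vector,
) -> Vector:
    """Run patches through extractor -> aggregator -> head and return class probabilities."""
    features = [extract_features(p) for p in patches]
    return classify(aggregate(features, query), weight_rows, biases)


# Toy input: three flattened "patches" of four pixel intensities each (made-up values).
example_patches = [[0.1, 0.2, 0.3, 0.4], [0.8, 0.9, 0.7, 0.6], [0.5, 0.5, 0.5, 0.5]]
example_query = [1.0, 0.5]  # hypothetical attention query
example_weights = [[0.4, -0.2], [-0.1, 0.3], [0.2, 0.2], [-0.3, 0.1]]  # 4 hypothetical classes
example_biases = [0.0, 0.0, 0.0, 0.0]

probabilities = predict_slide(example_patches, example_query, example_weights, example_biases)
print(probabilities)
```

The point of the sketch is the separation of concerns: per-patch features are computed independently, the aggregator reduces a variable number of patches to one slide-level vector, and only the head depends on the number of output classes.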
Results: The proposed model achieved an overall accuracy (ACC) of 0.524 and an area under the receiver operating characteristic curve (AUC) of 0.791 on the LZUFH_CRC dataset. Among the groups, MSI and NRAS were classified best, with F1-scores of 0.724 and 0.514, respectively. The study also visualized feature activation maps to show the model's regions of interest for different input images, finding that the model paid more attention to the transitional areas between tumor and non-tumor regions and to the mesenchymal areas of the tumor. Comparisons among clinical characteristic groups showed that the model did not exhibit significant biases in terms of gender, age, or tumor location.
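For readers less familiar with the reported metrics, the following minimal sketch shows how overall accuracy, per-class F1-score, and a multi-class AUC can be computed from predictions. It rests on assumptions: the abstract does not state how the multi-class AUC was averaged, so the macro-averaged one-vs-rest form used here is a guess, and the labels and probabilities are made-up toy values, not data from the study.

```python
from typing import Dict, List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each label."""
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    y_true: Sequence[str], prob_rows: Sequence[Sequence[float]], labels: Sequence[str]
) -> float:
    """Macro average of one-vs-rest AUCs (assumed averaging scheme, not confirmed by the source)."""
    aucs: List[float] = []
    for i, label in enumerate(labels):
        aucs.append(binary_auc([t == label for t in y_true], [row[i] for row in prob_rows]))
    return sum(aucs) / len(aucs)


# Toy example with hypothetical labels and probabilities (not study data).
toy_labels = ["MSI", "KRAS", "NRAS"]
toy_true = ["MSI", "KRAS", "MSI", "NRAS"]
toy_pred = ["MSI", "MSI", "MSI", "NRAS"]
toy_probs = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]

print(accuracy(toy_true, toy_pred))
print(f1_per_class(toy_true, toy_pred))
print(macro_ovr_auc(toy_true, toy_probs, toy_labels))
```

Accuracy and AUC can diverge considerably in a multi-class setting, as in the reported 0.524 versus 0.791: accuracy only counts the top-ranked label, whereas AUC rewards ranking the correct class highly even when it is not the top prediction.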
Conclusions: This study proposed a fine-grained classification method for CRC that applies DL techniques such as CNN and ViT to H&E-stained tissue section images, providing new insights into the molecular diagnosis of CRC. Although the performance of the model needs further improvement, the results indicate that DL technology has potential in the molecular detection of CRC. In the future, the research team will continue to optimize the model to improve the ACC and efficiency of fine-grained classification in CRC.
About the journal:
Translational Cancer Research (Transl Cancer Res; TCR; Print ISSN: 2218-676X; Online ISSN: 2219-6803; http://tcr.amegroups.com/) is an Open Access, peer-reviewed journal indexed in Science Citation Index Expanded (SCIE). TCR publishes laboratory studies of novel therapeutic interventions as well as clinical trials that evaluate new treatment paradigms for cancer, and results of novel research investigations that bridge the laboratory and clinical settings, including risk assessment, cellular and molecular characterization, prevention, detection, diagnosis, and treatment of human cancers, with the overall goal of improving the clinical care of cancer patients. The focus of TCR is original, peer-reviewed, science-based research that successfully advances clinical medicine toward the goal of improving patients' quality of life. The editors and an international advisory group of scientists, clinician-scientists, and other experts hold TCR articles to high-quality standards. The journal accepts Original Articles as well as Review Articles, Editorials, and Brief Articles.