Title: Gene Swin transformer: new deep learning method for colorectal cancer prognosis using transcriptomic data.
Authors: Yangyang Wang, Xinyu Yue, Shenghan Lou, Peinan Feng, Binbin Cui, Yanlong Liu
Journal: Briefings in Bioinformatics, 26(3), published 2025-05-01 (Journal Article)
Impact factor: 6.8; JCR: Q1 (Biochemical Research Methods)
DOI: 10.1093/bib/bbaf275 (https://doi.org/10.1093/bib/bbaf275)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12165829/pdf/
Citation count: 0
Abstract
Transcriptome sequencing has become essential in clinical tumor research, providing in-depth insights into the biology and functionality of tumor cells. However, the vast amount of data generated and the complex relationships among gene expression levels make it challenging to identify clinically relevant information. In this study, we developed a method called Gene Swin Transformer to address these challenges. This approach converts transcriptomic data into Synthetic Image Elements (SIEs). We used data from 12 datasets, including the GSE17536-GSE103479 datasets (n = 1771) and The Cancer Genome Atlas (n = 459), to generate SIEs. These elements were then classified by survival time using deep learning algorithms to predict colorectal cancer prognosis and build a reliable prognostic model. We trained and evaluated four deep learning models (BeiT, ResNet, Swin Transformer, and Vision Transformer (ViT)) and compared their performance. The enhanced Swin-T model outperformed the other models, achieving weighted precision, recall, and F1 scores of 0.708, 0.692, and 0.705, respectively, along with area under the curve (AUC) values of 80.2%, 72.7%, and 76.9% across three datasets. This model demonstrated the strongest prognostic prediction capabilities among those evaluated. Additionally, the PEX10 gene was identified as a key prognostic marker through both visual attention matrix analysis and bioinformatics methods. Our study demonstrates that the Gene Swin model effectively transforms ribonucleic acid (RNA) sequencing data into SIEs, enabling prognosis prediction through attention-based algorithms. This approach supports the development of a data-driven, unified, and automated model, offering a robust tool for classification and prediction tasks using RNA sequencing data. This advancement presents a novel clinical strategy for cancer treatment and prognosis forecasting.
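The abstract reports weighted precision, recall, and F1 scores. As a rough illustration of what "weighted" usually means for a multi-class classifier, the sketch below computes each metric per class and averages them by class support. It is not the authors' evaluation code: the function name `weighted_prf`, the class labels, and the toy predictions are hypothetical, and the abstract does not state which averaging implementation was used.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall and F1 over the classes in y_true."""
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # True positives and total predictions for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / count                            # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n                          # class share of the samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy labels standing in for survival-time classes (hypothetical, not study data).
y_true = ["short", "short", "short", "long", "long", "mid"]
y_pred = ["short", "short", "long", "long", "long", "short"]
precision, recall, f1 = weighted_prf(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under this definition the weighted F1 is an average of per-class F1 values, so it need not equal the harmonic mean of the weighted precision and weighted recall, and the weighted recall coincides with overall accuracy.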
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.