{"title":"SpaDiT:利用 scRNA-seq 进行空间基因表达预测的扩散变换器。","authors":"Xiaoyu Li, Fangfang Zhu, Wenwen Min","doi":"10.1093/bib/bbae571","DOIUrl":null,"url":null,"abstract":"<p><p>The rapid development of spatially resolved transcriptomics (SRT) technologies has provided unprecedented opportunities for exploring the structure of specific organs or tissues. However, these techniques (such as image-based SRT) can achieve single-cell resolution, but can only capture the expression levels of tens to hundreds of genes. Such spatial transcriptomics (ST) data, carrying a large number of undetected genes, have limited its application value. To address the challenge, we develop SpaDiT, a deep learning framework for spatial reconstruction and gene expression prediction using scRNA-seq data. SpaDiT employs scRNA-seq data as an a priori condition and utilizes shared genes between ST and scRNA-seq data as latent representations to construct inputs, thereby facilitating the accurate prediction of gene expression in ST data. SpaDiT enhances the accuracy of spatial gene expression predictions over a variety of spatial transcriptomics datasets. We have demonstrated the effectiveness of SpaDiT by conducting extensive experiments on both seq-based and image-based ST data. We compared SpaDiT with eight highly effective baseline methods and found that our proposed method achieved an 8%-12% improvement in performance across multiple metrics. Source code and all datasets used in this paper are available at https://github.com/wenwenmin/SpaDiT and https://zenodo.org/records/12792074.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11541600/pdf/","citationCount":"0","resultStr":"{\"title\":\"SpaDiT: diffusion transformer for spatial gene expression prediction using scRNA-seq.\",\"authors\":\"Xiaoyu Li, Fangfang Zhu, Wenwen Min\",\"doi\":\"10.1093/bib/bbae571\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The rapid development of spatially resolved transcriptomics (SRT) technologies has provided unprecedented opportunities for exploring the structure of specific organs or tissues. However, these techniques (such as image-based SRT) can achieve single-cell resolution, but can only capture the expression levels of tens to hundreds of genes. Such spatial transcriptomics (ST) data, carrying a large number of undetected genes, have limited its application value. To address the challenge, we develop SpaDiT, a deep learning framework for spatial reconstruction and gene expression prediction using scRNA-seq data. SpaDiT employs scRNA-seq data as an a priori condition and utilizes shared genes between ST and scRNA-seq data as latent representations to construct inputs, thereby facilitating the accurate prediction of gene expression in ST data. SpaDiT enhances the accuracy of spatial gene expression predictions over a variety of spatial transcriptomics datasets. We have demonstrated the effectiveness of SpaDiT by conducting extensive experiments on both seq-based and image-based ST data. We compared SpaDiT with eight highly effective baseline methods and found that our proposed method achieved an 8%-12% improvement in performance across multiple metrics. Source code and all datasets used in this paper are available at https://github.com/wenwenmin/SpaDiT and https://zenodo.org/records/12792074.</p>\",\"PeriodicalId\":9209,\"journal\":{\"name\":\"Briefings in bioinformatics\",\"volume\":\"25 6\",\"pages\":\"\"},\"PeriodicalIF\":6.8000,\"publicationDate\":\"2024-09-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11541600/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Briefings in bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1093/bib/bbae571\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Briefings in bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bib/bbae571","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
摘要
空间分辨转录组学(SRT)技术的快速发展为探索特定器官或组织的结构提供了前所未有的机会。然而,这些技术(如基于图像的 SRT)可以达到单细胞分辨率,但只能捕捉几十到几百个基因的表达水平。这种空间转录组学(ST)数据携带大量未检测到的基因,限制了其应用价值。为了应对这一挑战,我们开发了一种利用 scRNA-seq 数据进行空间重建和基因表达预测的深度学习框架 SpaDiT。SpaDiT 将 scRNA-seq 数据作为先验条件,利用 ST 和 scRNA-seq 数据之间的共享基因作为潜在表征来构建输入,从而促进 ST 数据中基因表达的准确预测。SpaDiT 提高了对各种空间转录组学数据集进行空间基因表达预测的准确性。我们在基于序列和图像的 ST 数据上进行了大量实验,证明了 SpaDiT 的有效性。我们将 SpaDiT 与八种高效的基线方法进行了比较,发现我们提出的方法在多个指标上的性能提高了 8%-12%。本文使用的源代码和所有数据集可在 https://github.com/wenwenmin/SpaDiT 和 https://zenodo.org/records/12792074 上获取。
SpaDiT: diffusion transformer for spatial gene expression prediction using scRNA-seq.
The rapid development of spatially resolved transcriptomics (SRT) technologies has provided unprecedented opportunities for exploring the structure of specific organs or tissues. However, these techniques (such as image-based SRT) can achieve single-cell resolution, but can only capture the expression levels of tens to hundreds of genes. Such spatial transcriptomics (ST) data, carrying a large number of undetected genes, have limited its application value. To address the challenge, we develop SpaDiT, a deep learning framework for spatial reconstruction and gene expression prediction using scRNA-seq data. SpaDiT employs scRNA-seq data as an a priori condition and utilizes shared genes between ST and scRNA-seq data as latent representations to construct inputs, thereby facilitating the accurate prediction of gene expression in ST data. SpaDiT enhances the accuracy of spatial gene expression predictions over a variety of spatial transcriptomics datasets. We have demonstrated the effectiveness of SpaDiT by conducting extensive experiments on both seq-based and image-based ST data. We compared SpaDiT with eight highly effective baseline methods and found that our proposed method achieved an 8%-12% improvement in performance across multiple metrics. Source code and all datasets used in this paper are available at https://github.com/wenwenmin/SpaDiT and https://zenodo.org/records/12792074.
期刊介绍:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.