scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling
Shengze Dong, Zhuorui Cui, Ding Liu, Jinzhi Lei
arXiv - QuanBio - Genomics, 2024-04-09, arXiv:2404.06153
Motivation: Single-cell RNA sequencing (scRNA-seq) is a groundbreaking
technology widely used in biological research to examine gene expression at
the level of individual cells within a tissue sample. Although numerous tools
have been developed for scRNA-seq data analysis, it remains challenging to
capture the distinctive features of such data and to generate virtual datasets
with similar statistical properties. Results: Our study introduces a generative approach termed
scRNA-seq Diffusion Transformer (scRDiT). The method generates virtual
scRNA-seq data by learning from a real dataset. It is a neural network built
on Denoising Diffusion Probabilistic Models (DDPMs) and Diffusion Transformers
(DiTs). During training, Gaussian noise is added to real samples through
iterative noising steps, and the network learns to reverse this process, so
that pure noise can ultimately be denoised into scRNA-seq samples. In this way
the model learns data features from actual scRNA-seq samples. Our experiments, conducted
on two distinct scRNA-seq datasets, demonstrate superior performance.
Additionally, sampling is accelerated by incorporating Denoising Diffusion
Implicit Models (DDIMs). scRDiT provides a unified framework that lets users
train neural network models on their own scRNA-seq datasets and generate large
numbers of high-quality scRNA-seq samples.
Availability and implementation: https://github.com/DongShengze/scRDiT
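
As a rough illustration of the noising and accelerated-sampling ideas summarized above, the following Python sketch shows the standard closed-form DDPM forward step and a deterministic DDIM update on a toy expression vector. It is not taken from the scRDiT repository: the linear noise schedule, the step counts, the toy values, and all function names are assumptions made for illustration, and the true noise stands in for the prediction that a trained network would supply.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form; returns (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def ddim_step(xt, eps_pred, t, t_prev, alpha_bars):
    """Deterministic DDIM update (eta = 0) from step t to an earlier step t_prev.

    t_prev = -1 denotes the clean sample, for which alpha_bar is taken as 1.
    """
    a_t = alpha_bars[t]
    a_prev = alpha_bars[t_prev] if t_prev >= 0 else 1.0
    x0_pred = [(x - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
               for x, e in zip(xt, eps_pred)]
    return [math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * e
            for x0, e in zip(x0_pred, eps_pred)]


def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly strided subsequence of the training steps, in descending order."""
    stride = num_train_steps // num_sample_steps
    return list(range(num_train_steps - 1, -1, -stride))[:num_sample_steps]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.0, 1.5, 0.0, 3.2]  # toy normalized expression vector (made-up values)
xt, eps = forward_noise(x0, 999, alpha_bars, rng)

# With the true noise standing in for a network prediction, a single DDIM
# jump from the last step to the clean sample recovers x0.
x0_rec = ddim_step(xt, eps, 999, -1, alpha_bars)
steps = ddim_timesteps(1000, 50)  # 50 sampling steps instead of 1000
```

The speed-up in this sketch comes only from `ddim_timesteps`: the sampler visits a strided subset of the training steps, so the network would be evaluated far fewer times than in full DDPM sampling.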