Single-cell RNA-seq data augmentation using generative Fourier transformer.
Nima Nouri
Communications Biology 8(1), p. 113. Published 2025-01-22.
DOI: https://doi.org/10.1038/s42003-025-07552-8
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11754799/pdf/
Abstract
Single-cell RNA sequencing (scRNA-seq) provides a powerful tool for dissecting cellular complexity and heterogeneity. However, its full potential to achieve statistically reliable conclusions is often constrained by the limited number of cells profiled, particularly in studies of rare diseases, specialized tissues, and uncommon cell types. Deep learning-based generative models (GMs) designed to address data scarcity often face similar limitations due to their reliance on pre-training or fine-tuning, inadvertently perpetuating a cycle of data inadequacy. To overcome this obstacle, we introduce scGFT (single-cell Generative Fourier Transformer), a train-free, cell-centric GM adept at synthesizing single cells that exhibit natural gene expression profiles present within authentic datasets. Using both simulated and experimental data, we demonstrate the mathematical rigor of scGFT and validate its ability to synthesize cells that preserve the intrinsic characteristics delineated in scRNA-seq data. Moreover, comparisons of scGFT with leading neural network-based GMs highlight its superior performance, driven by its analytical mechanism. By streamlining single-cell data augmentation, scGFT offers a scalable solution to mitigate data scarcity in cell-targeted research.
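The abstract does not spell out the mechanism, so the following is only a minimal sketch of what train-free, Fourier-domain synthesis of a cell might look like. It is not the scGFT implementation. The function names (`dft`, `inverse_dft`, `synthesize_cell`), the choice to rescale a single frequency component, and the scaling range are all assumptions made for illustration. The sketch transforms one cell's expression vector, perturbs one component together with its conjugate mirror so the result stays real-valued, and transforms back, with no model fitting involved.

```python
import cmath
import random


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def synthesize_cell(profile, scale_range=(0.5, 1.5), rng=None):
    """Hypothetical augmentation step: rescale one frequency component.

    `profile` is one cell's expression vector (at least two values).
    The zero-frequency term is left untouched, so the total expression
    of the synthetic cell equals that of the original cell.
    Values are not clipped, so small negative entries are possible.
    """
    rng = rng or random.Random()
    n = len(profile)
    coeffs = dft(profile)

    # Pick a non-zero frequency index from the first half of the spectrum.
    k = rng.randrange(1, n // 2 + 1)
    factor = rng.uniform(*scale_range)

    # Scale the component and its mirror to preserve conjugate symmetry,
    # which keeps the inverse transform real-valued.
    coeffs[k] *= factor
    mirror = (n - k) % n
    if mirror != k:
        coeffs[mirror] *= factor

    return [x.real for x in inverse_dft(coeffs)]


if __name__ == "__main__":
    original_cell = [0.0, 2.0, 1.0, 0.0, 3.0, 0.0, 0.0, 1.0]
    synthetic_cell = synthesize_cell(original_cell, rng=random.Random(0))
    print([round(v, 3) for v in synthetic_cell])
```

Because nothing is learned from data, a sketch like this runs on a single cell at a time. A practical version would likely use a fast transform (for example `numpy.fft.rfft` and `numpy.fft.irfft`) rather than the naive loops shown here.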
About the journal:
Communications Biology is an open access journal from Nature Research publishing high-quality research, reviews and commentary in all areas of the biological sciences. Research papers published by the journal represent significant advances bringing new biological insight to a specialized area of research.