{"title":"Biology-driven insights into the power of single-cell foundation models","authors":"Jialu Wu, Qing Ye, Yilin Wang, Renling Hu, Yiheng Zhu, Mingze Yin, Tianyue Wang, Jike Wang, Chang-Yu Hsieh, Tingjun Hou","doi":"10.1186/s13059-025-03781-6","DOIUrl":"https://doi.org/10.1186/s13059-025-03781-6","url":null,"abstract":"Single-cell foundation models (scFMs) have emerged as powerful tools for integrating heterogeneous datasets and exploring biological systems. Despite high expectations, their ability to extract unique biological insights beyond standard methods and their advantages over traditional approaches in specific tasks remain unclear. Here, we present a comprehensive benchmark study of six scFMs against well-established baselines under realistic conditions, encompassing two gene-level and four cell-level tasks. Pre-clinical batch integration and cell type annotation are evaluated across five datasets with diverse biological conditions, while clinically relevant tasks, such as cancer cell identification and drug sensitivity prediction, are assessed across seven cancer types and four drugs. Model performance is evaluated using 12 metrics spanning unsupervised, supervised, and knowledge-based approaches, including scGraph-OntoRWR, a novel metric designed to uncover intrinsic knowledge encoded by scFMs. We provide holistic rankings from dataset-specific to general performance to guide model selection. Our findings reveal that scFMs are robust and versatile tools for diverse applications while simpler machine learning models are more adept at efficiently adapting to specific datasets, particularly under resource constraints. Notably, no single scFM consistently outperforms others across all tasks, emphasizing the need for tailored model selection based on factors such as dataset size, task complexity, biological interpretability, and computational resources. 
This benchmark introduces novel evaluation perspectives, identifying the strengths and limitations of current scFMs, and paves the way for their effective application in biological and clinical research, including cell atlas construction, tumor microenvironment studies, and treatment decision-making.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"99 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145209970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"xCell 2.0: robust algorithm for cell type proportion estimation predicts response to immune checkpoint blockade","authors":"Almog Angel, Loai Naom, Shir Nabet-Levy, Dvir Aran","doi":"10.1186/s13059-025-03784-3","DOIUrl":"https://doi.org/10.1186/s13059-025-03784-3","url":null,"abstract":"Accurate estimation of cell type proportions from bulk gene expression data is essential for understanding the cellular heterogeneity underlying complex tissues and diseases. Here, we introduce xCell 2.0, an advanced version of the xCell algorithm, featuring a training function that permits the utilization of any reference dataset. xCell 2.0 generates cell type gene signatures using an improved methodology, including automated handling of cell type dependencies and more robust signature generation. We benchmark xCell 2.0 against eleven popular deconvolution methods using nine human and mouse reference sets and 26 validation datasets, encompassing 1711 samples and 67 cell types. Additionally, we validate xCell 2.0 using the independent Deconvolution DREAM Challenge dataset. xCell 2.0 outperforms all other tested methods across distinct reference datasets, demonstrating superior accuracy and consistency across diverse biological contexts. xCell 2.0 also shows the best performance in minimizing spillover effects between related cell types. In a test example of pan-cancer immune checkpoint blockade response prediction, xCell 2.0-derived tumor microenvironment (TME) features significantly improve prediction accuracy compared to models using only cancer type and treatment information, and outperform other deconvolution methods and established prediction scores. xCell 2.0 is a versatile and robust tool for cell type deconvolution that maintains high performance across various reference types and biological contexts. 
It is available both via a locally hosted web application and as a Bioconductor-compatible package, equipped with a large collection of pre-trained cell type signatures for human and mouse research.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"5 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145209969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-03 | DOI: 10.1186/s13059-025-03776-3
Zeyuan Johnson Chen, Elior Rahmani, Eran Halperin
{"title":"Unico: a unified model for cell-type resolution genomics from heterogeneous omics data","authors":"Zeyuan Johnson Chen, Elior Rahmani, Eran Halperin","doi":"10.1186/s13059-025-03776-3","DOIUrl":"https://doi.org/10.1186/s13059-025-03776-3","url":null,"abstract":"Most population-scale genomic datasets collected to date consist of “bulk” samples obtained from heterogeneous tissues, reflecting mixtures of different cell types. We introduce Unico, a Unified cross-omics computational method designed to deconvolve standard two-dimensional bulk matrices (samples by features) into three-dimensional tensors (samples by features by cell types). Unico is the first principled model-based deconvolution method that is theoretically justified for any tissue-level genomic data. By deconvolving bulk gene expression and DNA methylation datasets, we demonstrate Unico’s superior performance compared to existing methods, enhancing the ability to conduct powerful, large-scale genomic studies at cell-type resolution.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"41 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145209991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-03 | DOI: 10.1186/s13059-025-03802-4
Mitchell Conery, James A. Pippin, Yadav Wagley, Khanh Trang, Matthew C. Pahl, David A. Villani, Lacey J. Favazzo, Cheryl L. Ackert-Bicknell, Michael J. Zuscik, Eugene Katsevich, Andrew D. Wells, Babette S. Zemel, Benjamin F. Voight, Kurt D. Hankenson, Alessandra Chesi, Struan F. A. Grant
{"title":"GWAS-informed data integration and non-coding CRISPRi screen illuminate genetic etiology of bone mineral density","authors":"Mitchell Conery, James A. Pippin, Yadav Wagley, Khanh Trang, Matthew C. Pahl, David A. Villani, Lacey J. Favazzo, Cheryl L. Ackert-Bicknell, Michael J. Zuscik, Eugene Katsevich, Andrew D. Wells, Babette S. Zemel, Benjamin F. Voight, Kurt D. Hankenson, Alessandra Chesi, Struan F. A. Grant","doi":"10.1186/s13059-025-03802-4","DOIUrl":"https://doi.org/10.1186/s13059-025-03802-4","url":null,"abstract":"Over 1100 independent signals have been identified with genome-wide association studies (GWAS) for bone mineral density (BMD), a key risk factor for mortality-increasing fragility fractures; however, the effector gene(s) for most remain unknown. We execute a CRISPRi screen in human fetal osteoblasts (hFOBs) with single-cell RNA-seq read-out for 89 non-coding elements predicted to regulate osteoblast gene expression at BMD GWAS loci. The BMD relevance of hFOBs is supported by heritability enrichment from stratified LD-score regression involving 98 cell types grouped into 15 tissues. Twenty-three genes show perturbation in the screen, with four (ARID5B, CC2D1B, EIF4G2, and NCOA3) exhibiting consistent effects upon siRNA knockdown on three measures of osteoblast maturation and mineralization. Lastly, additional heritability enrichments, genetic correlations, and multi-trait fine-mapping unexpectedly reveal that many BMD GWAS signals are pleiotropic and likely mediate their effects via non-bone tissues. Our results provide a roadmap for how single-cell CRISPRi screens may be applied to the challenging task of resolving effector gene identities at all BMD GWAS loci. 
Extending our CRISPRi screening approach to other tissues could play a key role in fully elucidating the etiology of BMD.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"220 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145209971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-03 | DOI: 10.1186/s13059-025-03815-z
Zihan Dong, Wei Jiang, Jiangnan Shen, Hongyu Li, Yuhan Xie, Andrew T. DeWan, Hongyu Zhao
{"title":"Incorporating additive genetic effects and linkage disequilibrium information to discover gene-environment interactions using BV-LDER-GE","authors":"Zihan Dong, Wei Jiang, Jiangnan Shen, Hongyu Li, Yuhan Xie, Andrew T. DeWan, Hongyu Zhao","doi":"10.1186/s13059-025-03815-z","DOIUrl":"https://doi.org/10.1186/s13059-025-03815-z","url":null,"abstract":"Uncovering environmental factors interacting with genetic factors to influence complex traits is important in genetic epidemiology and disease etiology. We introduce BiVariate Linkage-Disequilibrium Eigenvalue Regression for Gene-Environment interactions (BV-LDER-GE), a statistical method that detects the overall contributions of G × E interactions in the genome using summary statistics of complex traits. In comparison to existing methods which either ignore correlations with additive effects or use partial information of linkage disequilibrium (LD), BV-LDER-GE harnesses correlations with additive genetic effects and full LD information to enhance the statistical power to detect genome-scale G × E interactions.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"157 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145210016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-02 | DOI: 10.1186/s13059-025-03765-6
Penghui Huang, Manqi Cai, Chris McKennan, Jiebiao Wang
{"title":"BLEND: probabilistic cellular deconvolution with individualized single-cell reference integration","authors":"Penghui Huang, Manqi Cai, Chris McKennan, Jiebiao Wang","doi":"10.1186/s13059-025-03765-6","DOIUrl":"https://doi.org/10.1186/s13059-025-03765-6","url":null,"abstract":"Cellular deconvolution estimates cell-type fractions from bulk transcriptomic data, but current methods often overlook cell type-specific expression varying across samples, discrepancies between bulk and single-cell data, or lack guidance on reference data selection and integration. Therefore, we present BLEND, a hierarchical Bayesian method that leverages multiple single-cell reference datasets to perform cellular deconvolution. BLEND estimates cellular fractions accurately by learning the most suitable reference for each bulk sample, accounting for the aforementioned issues. BLEND outperforms state-of-the-art methods in comprehensive benchmarking studies using human brain cortex data and provides reliable insights into Alzheimer’s disease progression.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"114 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145203175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-02 | DOI: 10.1186/s13059-025-03787-0
Nazanin Farahi, Tamas Lazar, Peter Tompa, Bálint Mészáros, Rita Pancsa
{"title":"Phase-separating fusion proteins drive cancer by upsetting transcription regulation","authors":"Nazanin Farahi, Tamas Lazar, Peter Tompa, Bálint Mészáros, Rita Pancsa","doi":"10.1186/s13059-025-03787-0","DOIUrl":"https://doi.org/10.1186/s13059-025-03787-0","url":null,"abstract":"Numerous cellular processes rely on biomolecular condensates formed through liquid–liquid phase separation (LLPS). Recently, it has become evident that somatic mutations can interfere with or over-activate the formation of phase-separated condensates. Here, we set out to systematically study the connection between cancer and biological condensation, specifically mapping the extent to which LLPS is affected in cancer and understanding the molecular pathomechanisms and therapeutic consequences of mutations affecting LLPS scaffolds. We identify both known and novel combinations of molecular functions that are specific to oncogenic fusion proteins and thus have a high potential for driving tumorigenesis. Protein regions driving condensate formation show an increased association with DNA- or chromatin-binding domains of transcription regulators within oncogenic fusion proteins, indicating a common molecular mechanism underlying several soft tissue sarcomas and hematologic malignancies where phase-separation-prone oncogenic fusion proteins form abnormal condensates along the DNA and thereby dysregulate gene expression programs. We find that proteins initiating LLPS are frequently implicated in somatic cancers, even surpassing their involvement in neurodegeneration. Our data shows that cancer-driving LLPS scaffolds tend to be potent oncogenes, giving rise to dominant phenotypes and lacking targeting options by current FDA-approved drugs. 
Finding the currently missing drugs to shut down oncogenic fusion proteins, to disrupt the condensation enabled by them, and to offset their downstream effects could provide cancer drugs widely applicable to diverse cancer incidences previously defying standard treatments.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"30 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145203166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-02 | DOI: 10.1186/s13059-025-03791-4
Sohyun Bang, Xuan Zhang, Jason Gregory, Ziliang Luo, Zongliang Chen, Mark A. A. Minow, Mary Galli, Andrea Gallavotti, Robert J. Schmitz
{"title":"WUSCHEL-dependent chromatin regulation in maize inflorescence development at single-cell resolution","authors":"Sohyun Bang, Xuan Zhang, Jason Gregory, Ziliang Luo, Zongliang Chen, Mark A. A. Minow, Mary Galli, Andrea Gallavotti, Robert J. Schmitz","doi":"10.1186/s13059-025-03791-4","DOIUrl":"https://doi.org/10.1186/s13059-025-03791-4","url":null,"abstract":"WUSCHEL (WUS) is a homeodomain transcription factor vital for stem cell proliferation in plant meristems. In maize, ZmWUS1 is expressed in the inflorescence meristem, including the central zone reservoir of stem cells. ZmWUS1 overexpression in the Barren inflorescence3 (Bif3) mutant perturbs inflorescence development due to stem cell over-proliferation. Single-cell Assay for Transposase Accessible Chromatin sequencing (scATAC-seq) shows that Bif3 alters central zone chromatin accessibility compared to normal inflorescence meristems. The CAATAATGC motif, a known homeodomain recognition site, is enriched within regions with increased chromatin accessibility in Bif3, suggesting ZmWUS1 could function as a transcriptional activator in the central zone. This motif differs from the TGAATGAA motif identified by DNA Affinity Purification sequencing (DAP-seq) of ZmWUS1, which showed low enrichment in the central zone. Conversely, regions with decreased chromatin accessibility in Bif3 are instead adjacent to AUXIN RESPONSE FACTOR genes, suggesting possible reduced auxin signaling in the Bif3 central zone. 
This study characterized how Bif3 overexpression of ZmWUS1 influences chromatin accessibility in the central zone, reducing auxin signaling, while raising questions about differential ZmWUS1 motif usage in distinct cellular contexts.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"72 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145203167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-01 | DOI: 10.1186/s13059-025-03799-w
Clare E. Holleley, Erin E. Hahn
{"title":"The rise of historical epigenomics and temporal analysis of gene regulation","authors":"Clare E. Holleley, Erin E. Hahn","doi":"10.1186/s13059-025-03799-w","DOIUrl":"https://doi.org/10.1186/s13059-025-03799-w","url":null,"abstract":"Complex diseases driven by gene-environment interactions impose a heavy burden on human and animal health. Addressing these challenges requires innovative research. The emerging field of historical epigenomics offers a promising opportunity to link genotypes with phenotypes using preserved biological material. New methods such as historical chromatin profiling in museum specimens provide valuable insights into vertebrate genome regulation. Building on successful work with formalin-fixed paraffin-embedded (FFPE) samples, we expect growing interest in using historical specimens for biomedical, evolutionary, and ecological research. Applied to historical collections, these tools can provide critical baselines for understanding modern diseases, environmental stressors, and human adaptation.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"15 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145195150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology | Pub Date: 2025-10-01 | DOI: 10.1186/s13059-025-03793-2
Yaliang Shi, Bo Li, Yuanfen Gao, Xiaohan Wang, Yang Liu, Xiang Lu, Hao Lin, Wei Li, Dili Lai, Ming Hao, Jia Gao, Kaixuan Zhang, Dengcai Liu, Sun-Hee Woo, Muriel Quinet, Alisdair R. Fernie, Xu Liu, Yuqi He, Meiliang Zhou
{"title":"Phylogenomics provides comprehensive insights into the evolutionary relationships among cultivated buckwheat species","authors":"Yaliang Shi, Bo Li, Yuanfen Gao, Xiaohan Wang, Yang Liu, Xiang Lu, Hao Lin, Wei Li, Dili Lai, Ming Hao, Jia Gao, Kaixuan Zhang, Dengcai Liu, Sun-Hee Woo, Muriel Quinet, Alisdair R. Fernie, Xu Liu, Yuqi He, Meiliang Zhou","doi":"10.1186/s13059-025-03793-2","DOIUrl":"https://doi.org/10.1186/s13059-025-03793-2","url":null,"abstract":"Buckwheat belongs to the family Polygonaceae and genus Fagopyrum, which is characterized by high flavonoid content, short growth period, and strong environmental adaptability. Buckwheat has three cultivated species, including the annual food crops common buckwheat (Fagopyrum esculentum) and Tartary buckwheat (Fagopyrum tataricum), and the perennial traditional herbal medicine golden buckwheat (Fagopyrum cymosum). However, the unclear phylogenetic relationships among these three species based on genomic data limit buckwheat interspecific hybridization and genetic improvement. Despite their enormous differences in morphology and genome, we confirm that Fagopyrum cymosum is most closely related to Fagopyrum tataricum rather than to Fagopyrum esculentum. The results are also verified through collecting and sequencing an extensive sampling of cultivated/wild populations across all environmentally distinct regions in which these species are found. The changes in flowering time and style morphology controlled by the AP1 and S-ELF3 loci significantly contribute to buckwheat speciation. The introgression from Fagopyrum cymosum into wild Fagopyrum tataricum explains why wild Fagopyrum tataricum exhibits seed morphology similar to Fagopyrum cymosum. Furthermore, the convergent traits of leaf morphology and higher flavonoid content between Fagopyrum cymosum and wild Fagopyrum esculentum are linked to high-altitude adaptation. 
Fagopyrum cymosum is more closely related to wild Fagopyrum tataricum, a fact that is confirmed by interspecific hybridization. Our work provides a valuable example of how phylogenomics can be efficiently utilized for phylogenetic relationship analysis between crops and their wild species relatives, as well as elucidating the plant speciation from the perspectives of genomic evolution and adaptive mechanisms.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"93 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145195149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}