Data standards for single-cell RNA-sequencing of paediatric cancer
Xiaohan Xu, John Saxon, Megan Sioe Fei Soon, Colin YC Lee, Zewen Kelvin Tuong
Clinical & Translational Immunology, vol. 14, no. 5. Published 2025-05-23. DOI: 10.1002/cti2.70033
https://onlinelibrary.wiley.com/doi/10.1002/cti2.70033
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful tool for investigating paediatric cancers, but individual studies often profile only a small number of individuals. It is now standard practice to upload scRNA-seq data to public repositories to support scientific reproducibility. Public data deposition is a cost-effective and sustainability-conscious solution that allows any researcher to download and analyse existing scRNA-seq data to develop new ideas. This is especially valuable in paediatric cancer research, where access to funding and to patient cohorts may be limited. However, standards for data deposition are absent, leading to significant issues that may slow progress. As a consequence, it can be difficult, or even impossible, for other researchers to validate findings or use these data for tailored analyses. Here, we systematically accessed and reviewed publicly available scRNA-seq data sets from various paediatric cancer studies, covering over 1.3 million cells across 488 clinical samples. We highlight striking inconsistencies in study design and data availability across several levels, which hinder downstream analyses and data reproducibility. To address these challenges, we propose a framework of recommendations to improve data deposition practices, promote more effective use of scRNA-seq data sets deposited in public repositories, and accelerate discoveries in paediatric cancer research and beyond. We urge data standards institutes and repositories, such as NCBI Gene Expression Omnibus (GEO) and European Genome-Phenome Archive (EGA), to strictly enforce these standardised data practices.
About the journal:
Clinical & Translational Immunology is an open access, fully peer-reviewed journal devoted to publishing cutting-edge advances in biomedical research for scientists and physicians. The Journal covers fields including cancer biology, cardiovascular research, gene therapy, immunology, vaccine development and disease pathogenesis and therapy at the earliest phases of investigation.