BMC genomic data | Pub Date: 2024-06-06 | DOI: 10.1186/s12863-024-01240-y
Robert W Meredith, Yoamel Milián-García, John Gatesy, Michael A Russello, George Amato
{"title":"Draft assembly and annotation of the Cuban crocodile (Crocodylus rhombifer) genome.","authors":"Robert W Meredith, Yoamel Milián-García, John Gatesy, Michael A Russello, George Amato","doi":"10.1186/s12863-024-01240-y","DOIUrl":"10.1186/s12863-024-01240-y","url":null,"abstract":"<p><strong>Objectives: </strong>The new data provide an important genomic resource for the Critically Endangered Cuban crocodile (Crocodylus rhombifer). Cuban crocodiles are restricted to the Zapata Swamp in southern Matanzas Province, Cuba, and readily hybridize with the widespread American crocodile (Crocodylus acutus) in areas of sympatry. The reported de novo assembly will contribute to studies of crocodylian evolutionary history and provide a resource for informing Cuban crocodile conservation.</p><p><strong>Data description: </strong>The final 2.2 Gb draft genome for C. rhombifer consists of 41,387 scaffolds (contig N50 = 104.67 Kb; scaffold N50 = 518.55 Kb). Benchmarking Universal Single-Copy Orthologs (BUSCO) identified 92.3% of the 3,354 genes in the vertebrata_odb10 database. Approximately 42% of the genome (960 Mbp) comprises repeat elements. We predicted 30,138 unique protein-coding sequences (17,737 unique genes) in the genome assembly. Functional annotation found the top Gene Ontology annotations for Biological Processes, Molecular Function, and Cellular Component were regulation, protein, and intracellular, respectively. 
This assembly will support future macroevolutionary, conservation, and molecular studies of the Cuban crocodile.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"53"},"PeriodicalIF":0.0,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11157745/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141285539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BMC genomic data | Pub Date: 2024-06-06 | DOI: 10.1186/s12863-024-01238-6
Chun Hing She, Hing Wai Tsang, Xingtian Yang, Sabrina Sl Tsao, Clara Sm Tang, Sophelia Hs Chan, Mike Yw Kwan, Gilbert T Chua, Wanling Yang, Patrick Ip
{"title":"Genome-wide association study of BNT162b2 vaccine-related myocarditis identifies potential predisposing functional areas in Hong Kong adolescents.","authors":"Chun Hing She, Hing Wai Tsang, Xingtian Yang, Sabrina Sl Tsao, Clara Sm Tang, Sophelia Hs Chan, Mike Yw Kwan, Gilbert T Chua, Wanling Yang, Patrick Ip","doi":"10.1186/s12863-024-01238-6","DOIUrl":"10.1186/s12863-024-01238-6","url":null,"abstract":"<p><p>Vaccine-related myocarditis associated with the BNT162b2 vaccine is a rare complication, with a higher risk observed in male adolescents. However, the contribution of genetic factors to this condition remains uncertain. In this study, we conducted a comprehensive genetic association analysis in a cohort of 43 Hong Kong Chinese adolescents who were diagnosed with myocarditis shortly after receiving the BNT162b2 mRNA COVID-19 vaccine. A comparison of whole-genome sequencing data was performed between the confirmed myocarditis cases and a control group of 481 healthy individuals. To narrow down potential genomic regions of interest, we employed a novel clustering approach called ClusterAnalyzer, which prioritised 2,182 genomic regions overlapping with 1,499 genes for further investigation. Our pathway analysis revealed significant enrichment of these genes in functions related to cardiac conduction, ion channel activity, plasma membrane adhesion, and axonogenesis. These findings suggest a potential genetic predisposition in these specific functional areas that may contribute to the observed side effect of the vaccine. Nevertheless, further validation through larger-scale studies is imperative to confirm these findings. Given the increasing prominence of mRNA vaccines as a promising strategy for disease prevention and treatment, understanding the genetic factors associated with vaccine-related myocarditis assumes paramount importance. 
Our study provides valuable insights that significantly advance our understanding in this regard and serve as a valuable foundation for future research endeavours in this field.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"51"},"PeriodicalIF":0.0,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11155081/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141285541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying variations associated with dental caries reveals disparity in effect allele frequencies across diverse populations.","authors":"Sangram Sandhu, Varun Sharma, Sachin Kumar, Niraj Rai, Pooran Chand","doi":"10.1186/s12863-024-01215-z","DOIUrl":"10.1186/s12863-024-01215-z","url":null,"abstract":"<p><strong>Background: </strong>Dental caries (DC) is a multifaceted oral condition influenced by genetic and environmental factors. Recent advancements in genotyping and sequencing technologies, such as Genome-Wide Association Studies (GWAS), have helped researchers identify numerous genetic variants associated with DC, but their prevalence and significance across diverse global populations remain poorly understood, as most studies were conducted in European populations and very few in Asian populations, particularly Indians.</p><p><strong>Aim: </strong>This study aimed to evaluate the genetic affinity of effect alleles associated with DC to understand the genetic relationship between global populations with respect to the Indian context.</p><p><strong>Methodology: </strong>The present study used an empirical approach in which variants associated with DC susceptibility were selected. These variants were identified and annotated using the GWAS summary. The genetic affinity was evaluated using Fst.</p><p><strong>Results: </strong>Effect allele frequencies among different populations were examined, revealing variations in allele distribution. African populations exhibited higher frequencies of specific risk alleles, whereas East Asian and European populations displayed distinct profiles. 
South Asian populations showed a unique genetic cluster.</p><p><strong>Conclusion: </strong>Our study emphasises the complex genetic landscape of DC and highlights the need for population-specific research as well as validation of GWAS-identified markers in Indians before defining them as established candidate genes.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"50"},"PeriodicalIF":1.9,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11149341/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141238857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative chloroplast genomics and phylogenetic analysis of Oreomecon nudicaulis (Papaveraceae).","authors":"Qingbin Zhan, Yalin Huang, Xiaoming Xue, Yunxia Chen","doi":"10.1186/s12863-024-01236-8","DOIUrl":"10.1186/s12863-024-01236-8","url":null,"abstract":"<p><p>Oreomecon nudicaulis, commonly known as mountain poppy, is a significant perennial herb. In 2022, the species O. nudicaulis, which was previously classified under the genus Papaver, was reclassified within the genus Oreomecon. Nevertheless, the phylogenetic status and chloroplast genome within the genus Oreomecon have not yet been reported. This study elucidates the chloroplast genome sequence and structural features of O. nudicaulis and explores its evolutionary relationships within Papaveraceae. Using Illumina sequencing technology, the chloroplast genome of O. nudicaulis was sequenced, assembled, and annotated. The results indicate that the chloroplast genome of O. nudicaulis exhibits a typical circular quadripartite structure. The chloroplast genome is 153,903 bp in length, with a GC content of 38.87%, containing 84 protein-coding genes, 8 rRNA genes, 38 tRNA genes, and 2 pseudogenes. The genome encodes 25,815 codons, with leucine (Leu) being the most abundant amino acid, and the most frequently used codon is AUU. Additionally, 129 microsatellite markers were identified, with mononucleotide repeats being the most abundant (53.49%). Our phylogenetic analysis revealed that O. nudicaulis has a relatively close relationship with the genus Meconopsis within the Papaveraceae family. The phylogenetic analysis supported the taxonomic status of O. nudicaulis, as it did not form a clade with other Papaver species, consistent with the revised taxonomy of Papaveraceae. This is the first report of a phylogenomic study of the complete chloroplast genome in the genus Oreomecon, which is a significant genus worldwide. This analysis of the O. 
nudicaulis chloroplast genome provides a theoretical basis for research on genetic diversity, molecular marker development, and species identification, enriching genetic information and supporting the evolutionary relationships among Papaveraceae.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"49"},"PeriodicalIF":0.0,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141030/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141181657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BMC genomic data | Pub Date: 2024-05-23 | DOI: 10.1186/s12863-024-01225-x
Inbar Cohen-Gihon, Galia Zaide, Sharon Amit, Iris Zohar, Orna Schwartz, Yasmin Maor, Ofir Israeli, Gal Bilinsky, Ma'ayan Israeli, Shirley Lazar, David Gur, Moshe Aftalion, Anat Zvi, Adi Beth-Din, Erez Bar-Haim, Uri Elia, Ofer Cohen, Emanuelle Mamroud, Theodor Chitlaru
{"title":"Genome sequence of two novel virulent clinical strains of Burkholderia pseudomallei isolated from acute melioidosis cases imported to Israel from India and Thailand.","authors":"Inbar Cohen-Gihon, Galia Zaide, Sharon Amit, Iris Zohar, Orna Schwartz, Yasmin Maor, Ofir Israeli, Gal Bilinsky, Ma'ayan Israeli, Shirley Lazar, David Gur, Moshe Aftalion, Anat Zvi, Adi Beth-Din, Erez Bar-Haim, Uri Elia, Ofer Cohen, Emanuelle Mamroud, Theodor Chitlaru","doi":"10.1186/s12863-024-01225-x","DOIUrl":"10.1186/s12863-024-01225-x","url":null,"abstract":"<p><strong>Objective: </strong>Burkholderia pseudomallei, the etiological cause of melioidosis, is a soil saprophyte endemic in South-East Asia, where it constitutes a high-priority public health concern. Melioidosis cases are sporadically identified in nonendemic areas, usually associated with travelers or import of goods from endemic regions. Due to extensive intercontinental traveling and the anticipated climate change-associated alterations of the soil bacterial flora, there is an increasing concern for inadvertent establishment of novel endemic areas, which may expand the global burden of melioidosis. Rapid diagnosis, isolation and characterization of B. pseudomallei isolates is therefore of utmost importance, particularly in non-endemic locations.</p><p><strong>Data description: </strong>We report the genome sequences of two novel clinical isolates (MWH2021 and MST2022) of B. pseudomallei identified in distinct acute cases of melioidosis diagnosed in two individuals arriving in Israel from India and Thailand, respectively. The data includes preliminary genetic analysis of the genomes determining their phylogenetic classification relative to the genomes of 131 B. pseudomallei strains documented in the NCBI database. Inspection of the genomic data revealed the presence or absence of loci encoding several documented virulence determinants involved in the molecular pathogenesis of melioidosis. 
Virulence analysis in murine models of acute or chronic melioidosis established that both strains belong to the highly virulent class of B. pseudomallei.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"47"},"PeriodicalIF":0.0,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11118722/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141089366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BMC genomic data | Pub Date: 2024-05-23 | DOI: 10.1186/s12863-024-01231-z
Yi Zhang, Endian Yang, Qin Liu, Jie Zhang, Chen Feng
{"title":"Combined full-length transcriptomic and metabolomic analysis reveals the molecular mechanisms underlying nutrients and taste components development in Primulina juliae.","authors":"Yi Zhang, Endian Yang, Qin Liu, Jie Zhang, Chen Feng","doi":"10.1186/s12863-024-01231-z","DOIUrl":"10.1186/s12863-024-01231-z","url":null,"abstract":"<p><strong>Background: </strong>Primulina juliae has recently emerged as a novel functional vegetable, boasting a significant biomass and high calcium content. Various breeding strategies have been applied to the domestication of P. juliae. However, the absence of genome and transcriptome information has hindered research into the mechanisms governing the taste and nutrients in this plant. In this study, we conducted a comprehensive analysis, combining full-length transcriptomics and metabolomics, to unveil the molecular mechanisms responsible for the development of nutrients and taste components in P. juliae.</p><p><strong>Results: </strong>We obtained a high-quality reference transcriptome of P. juliae by combining the PacBio Iso-seq and Illumina sequencing technologies. A total of 58,536 cluster consensus sequences were obtained, including 28,168 complete protein coding transcripts and 8,021 Long Non-coding RNAs. Significant differences were observed in the composition and content of compounds related to nutrients and taste, particularly flavonoids, during leaf development. Our results showed a decrease in the content of most flavonoids as leaves develop. Malate and succinate accumulated with leaf development, while some sugar metabolites decreased. Furthermore, we identified the differential accumulation of amino acids and fatty acids, which are associated with taste traits. Moreover, our transcriptomic analysis provided a molecular basis for understanding the metabolic variations during leaf development. 
We identified 4,689 differentially expressed genes in the two developmental stages, and through a comprehensive transcriptome and metabolome analysis, we discovered the key structural genes and transcription factors involved in the pathways.</p><p><strong>Conclusions: </strong>This study provides a high-quality reference transcriptome and reveals molecular mechanisms associated with the development of nutrients and taste components in P. juliae. These findings will enhance our understanding of the breeding and utilization of P. juliae as a vegetable.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"46"},"PeriodicalIF":0.0,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11112898/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141089363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BMC genomic data | Pub Date: 2024-05-23 | DOI: 10.1186/s12863-024-01230-0
Zheng-Feng Wang, Lin-Fang Wu, Lei Chen, Wei-Guang Zhu, En-Ping Yu, Feng-Xia Xu, Hong-Lin Cao
{"title":"Genome assembly of Ottelia alismoides, a multiple-carbon utilisation aquatic plant.","authors":"Zheng-Feng Wang, Lin-Fang Wu, Lei Chen, Wei-Guang Zhu, En-Ping Yu, Feng-Xia Xu, Hong-Lin Cao","doi":"10.1186/s12863-024-01230-0","DOIUrl":"10.1186/s12863-024-01230-0","url":null,"abstract":"<p><strong>Objectives: </strong>Ottelia Pers. is in the Hydrocharitaceae family. Species in the genus are aquatic, and China is their centre of origin in Asia. Ottelia alismoides (L.) Pers., which is distributed worldwide, is a distinguishing element in China, while other species of this genus are endemic to China. However, O. alismoides is also considered endangered due to habitat loss and pollution in some Asian countries. Ottelia alismoides is the only submerged macrophyte that contains three carbon dioxide-concentrating mechanisms, i.e. bicarbonate (HCO<sub>3</sub><sup>-</sup>) use, crassulacean acid metabolism and the C4 pathway. In this study, we present its first genome assembly to help illustrate the various carbon metabolism mechanisms and to enable genetic conservation in the future.</p><p><strong>Data description: </strong>Using DNA and RNA extracted from one O. alismoides leaf, this work produced ∼ 73.4 Gb HiFi reads, ∼ 126.4 Gb whole genome sequencing short reads and ∼ 21.9 Gb RNA-seq reads. The de novo genome assembly was 6,455,939,835 bp in length, with 11,923 scaffolds/contigs and an N50 of 790,733 bp. Genome assembly completeness assessment with Benchmarking Universal Single-Copy Orthologs revealed a score of 94.4%. The repetitive sequence in the assembly was 4,875,817,144 bp (75.5%). A total of 116,176 genes were predicted. 
The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"48"},"PeriodicalIF":1.9,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11118731/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141089365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BMC genomic data | Pub Date: 2024-05-09 | DOI: 10.1186/s12863-024-01218-w
Qadrul Nisa, Gazala Gulzar, Mohammad Saleem Dar, Efath Shahnaz, Saba Banday, Zahoor A Bhat, Mohamed A El-Sheikh, Sajad Un Nabi, Vivak M Arya, Ali Anwar, Sheikh Mansoor
{"title":"New reports of pathogen spectrum associated with bulb rot and their interactions during the development of rot in tulip.","authors":"Qadrul Nisa, Gazala Gulzar, Mohammad Saleem Dar, Efath Shahnaz, Saba Banday, Zahoor A Bhat, Mohamed A El-Sheikh, Sajad Un Nabi, Vivak M Arya, Ali Anwar, Sheikh Mansoor","doi":"10.1186/s12863-024-01218-w","DOIUrl":"10.1186/s12863-024-01218-w","url":null,"abstract":"<p><p>Bulb rot, a highly damaging disease of tulip plants, has hindered their profitable cultivation worldwide. This rot occurs in both field and storage conditions, posing significant challenges. While this disease has been attributed to a range of pathogens, previous investigations have solely examined it within the framework of a single-pathogen disease model. Our study took a different approach and identified four pathogens associated with the disease: Fusarium solani, Penicillium chrysogenum, Botrytis tulipae, and Aspergillus niger. The primary objective of our research was to examine the impact of co-infections on the overall virulence dynamics of these pathogens. Through co-inoculation experiments on potato dextrose agar, we delineated three primary interaction patterns: antibiosis, deadlock, and merging. In vitro trials involving individual pathogen inoculations on tulip bulbs revealed that B. tulipae was the most virulent and induced complete bulb decay. Nonetheless, when these pathogens were simultaneously introduced in various combinations, outcomes ranged from partial bulb decay to prolonged rotting periods. This indicated a notable degree of antagonistic behaviour among the pathogens. While synergistic interactions were evident in a few combinations, antagonism overwhelmingly prevailed. The complex interplay of these pathogens during co-infection led to a noticeable change in the overall severity of the disease. 
This underscores the significance of pathogen-pathogen interactions in the realm of plant pathology, offering new insights for understanding and managing tulip bulb rot.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"40"},"PeriodicalIF":0.0,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11080242/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140900659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative transcriptome analysis between two different cadmium-accumulating genotypes of soybean (Glycine max) in response to cadmium stress.","authors":"Xiaoqing Liu, Hongmei Zhang, Wei Zhang, Qianru Jia, Xin Chen, Huatao Chen","doi":"10.1186/s12863-024-01226-w","DOIUrl":"10.1186/s12863-024-01226-w","url":null,"abstract":"<p><strong>Background: </strong>Cadmium (Cd) is extremely toxic and non-essential for plants. Different soybean varieties differ greatly in their Cd accumulation ability, but little is known about the underlying molecular mechanisms.</p><p><strong>Results: </strong>Here, we performed transcriptomic analysis using Illumina paired-end sequencing on root tissues from two soybean varieties (su8, high-Cd-accumulating (HAS) and su7, low-Cd-accumulating (LAS)) grown with 0 or 50 μM CdSO<sub>4</sub>. A total of 18.76 million clean reads from the soybean root samples were obtained after quality assessment and data filtering. After Cd treatment, 739 differentially expressed genes (DEGs; 265 up and 474 down) were found in HAS; however, only 259 DEGs (88 up and 171 down) were found in LAS, and 64 genes were shared between the two varieties. Pathway enrichment analysis suggested that after cadmium treatment, the DEGs between LAS and HAS were mainly enriched in glutathione metabolism and plant-pathogen interaction pathways. KEGG analysis showed that phenylalanine metabolism responded to cadmium stress in LAS, while ABC transporters responded to cadmium stress in HAS. In addition, we found more differentially expressed heavy metal transporters, such as ABC transporters and zinc transporters, in HAS than in LAS, and more transcription factors were differentially expressed in HAS than in LAS after cadmium treatment, e.g. 
bHLH transcription factor, WRKY transcription factor and ZIP transcription factor.</p><p><strong>Conclusions: </strong>Findings from this study will provide new insights into the underlying molecular mechanisms behind Cd accumulation in soybean.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"43"},"PeriodicalIF":0.0,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11075288/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140872248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BMC genomic data | Pub Date: 2024-05-07 | DOI: 10.1186/s12863-024-01223-z
Grant C O'Connell
{"title":"Dataset including whole blood gene expression profiles and matched leukocyte counts with utility for benchmarking cellular deconvolution pipelines.","authors":"Grant C O'Connell","doi":"10.1186/s12863-024-01223-z","DOIUrl":"10.1186/s12863-024-01223-z","url":null,"abstract":"<p><strong>Objectives: </strong>Cellular deconvolution is a valuable computational process that can infer the cellular composition of heterogeneous tissue samples from bulk RNA-sequencing data. Benchmark testing is a crucial step in the development and evaluation of new cellular deconvolution algorithms, and also plays a key role in the process of building and optimizing deconvolution pipelines for specific experimental applications. However, few in vivo benchmarking datasets exist, particularly for whole blood, which is the single most profiled human tissue. Here, we describe a unique dataset containing whole blood gene expression profiles and matched circulating leukocyte counts from a large cohort of human donors with utility for benchmarking cellular deconvolution pipelines.</p><p><strong>Data description: </strong>To produce this dataset, venous whole blood was sampled from 138 total donors recruited at an academic medical center. Genome-wide expression profiling was subsequently performed via next-generation RNA sequencing, and white blood cell differentials were collected in parallel using flow cytometry. 
The resultant final dataset contains donor-level expression data for over 45,000 protein coding and non-protein coding genes, as well as matched neutrophil, lymphocyte, monocyte, and eosinophil counts.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"25 1","pages":"45"},"PeriodicalIF":0.0,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11077736/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140878048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}