Chris Hartl, Jiali Zhuang, Aaron Tyler, Bing Zhou, Emily Wong, David Merberg, Brad Farrell, Chris DeBoever, Julie Bryant, Dorothée Diogo
Title: CREdb: A comprehensive database of Cis-Regulatory Elements and their activity in human cells and tissues
Journal: Epigenetics & Chromatin, volume 17, issue 1, article 21
Published: 2024-07-16
DOI: 10.1186/s13072-024-00545-7 (https://doi.org/10.1186/s13072-024-00545-7)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11253421/pdf/
Journal impact factor: 4.2 (JCR Q1, Genetics & Heredity)
Abstract
Background: Cis-regulatory elements (CREs) play a pivotal role in gene expression regulation, allowing cells to serve diverse functions and respond to external stimuli. Understanding CREs is essential for personalized medicine and disease research, as an increasing number of genetic variants associated with phenotypes and diseases overlap with CREs. However, existing databases often focus on subsets of regulatory elements and present each identified instance of an element individually, which hinders efforts to obtain a comprehensive view. To address this gap, we have created CREdb, a comprehensive database of over 10 million human regulatory elements across 1,058 cell types and 315 tissues, harmonized from different data sources. We curated the cell types and tissues and aligned them to standard ontologies to support efficient data queries.
Results: Data from 11 sources were curated and mapped to standard ontological terms. The final database contains 11,223,434 combined elements, which were merged into 5,666,240 consensus elements representing the combined ranges of the individual elements, informed by their overlap. Each consensus element carries curated metadata, including the number of elements supporting it and a hash linking to the source databases. The inferred activity of each consensus element in various cell-type and tissue contexts is also provided. Examples presented here show the potential utility of CREdb in annotating non-coding genetic variants and informing chromatin accessibility profiling analysis.
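To make the idea of a consensus element concrete, the following is a minimal Python sketch of merging overlapping elements into combined ranges while recording a support count and a source digest. It is an illustration under stated assumptions, not the CREdb pipeline: the abstract does not specify the merging rule, so a plain union of any overlapping intervals is assumed here, and the record layout, the names `merge_elements` and `ConsensusElement`, and the SHA-1 digest of source names (standing in for the hash that links to source databases) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record layout: (chromosome, start, end, source_name),
# with 0-based, half-open coordinates.
Element = Tuple[str, int, int, str]


@dataclass
class ConsensusElement:
    chrom: str
    start: int
    end: int
    n_support: int    # number of source elements merged into this range
    source_hash: str  # short digest of the contributing source names (assumed scheme)


def _close_group(group: List[Element], end: int) -> ConsensusElement:
    """Turn one run of overlapping elements into a consensus record."""
    sources = sorted({src for _, _, _, src in group})
    digest = hashlib.sha1("|".join(sources).encode("utf-8")).hexdigest()[:12]
    # The group is sorted by start, so its first member holds the minimum start.
    return ConsensusElement(group[0][0], group[0][1], end, len(group), digest)


def merge_elements(elements: List[Element]) -> List[ConsensusElement]:
    """Union overlapping elements on the same chromosome (assumed merge rule)."""
    consensus: List[ConsensusElement] = []
    group: List[Element] = []
    group_end = 0
    for el in sorted(elements, key=lambda e: (e[0], e[1], e[2])):
        chrom, start, end, _ = el
        if group and chrom == group[0][0] and start < group_end:
            # Overlaps the running range: extend it.
            group.append(el)
            group_end = max(group_end, end)
        else:
            if group:
                consensus.append(_close_group(group, group_end))
            group = [el]
            group_end = end
    if group:
        consensus.append(_close_group(group, group_end))
    return consensus


# Made-up example data, not taken from CREdb.
demo = [
    ("chr1", 100, 200, "sourceA"),
    ("chr1", 150, 300, "sourceB"),
    ("chr1", 400, 500, "sourceA"),
    ("chr2", 100, 200, "sourceC"),
]
for c in merge_elements(demo):
    print(c.chrom, c.start, c.end, c.n_support)
```

In this toy input, the two overlapping chr1 elements collapse into one range with a support count of 2, while the other two elements stay separate. A real pipeline might instead require a minimum reciprocal overlap before merging; that choice would change which elements are grouped.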
Conclusions: We developed CREdb, a comprehensive database of CREs, to simplify the analysis of CREs by providing a unified framework for researchers. CREdb compiles consensus ranges for each element by integrating the information from all instances identified across various source databases. This unified database facilitates the functional annotation of non-coding genetic variants and complements chromatin accessibility profiling analysis. CREdb will serve as an important resource in expanding our knowledge of the epigenome and its role in human diseases.
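As an illustration of how a set of consensus ranges could support the annotation of non-coding variants, here is a minimal Python sketch that looks up which element, if any, contains a variant position. It does not use any CREdb interface: the tuple layout, the element identifiers, and the function names `build_index` and `annotate_variant` are hypothetical, coordinates are assumed to be 0-based and half-open, and the consensus elements on a chromosome are assumed not to overlap one another.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: (chromosome, start, end, element_id), 0-based half-open.
Element = Tuple[str, int, int, str]
Interval = Tuple[int, int, str]
Index = Dict[str, Tuple[List[int], List[Interval]]]


def build_index(elements: List[Element]) -> Index:
    """Group elements by chromosome and sort them by start for binary search."""
    by_chrom: Dict[str, List[Interval]] = {}
    for chrom, start, end, element_id in elements:
        by_chrom.setdefault(chrom, []).append((start, end, element_id))
    index: Index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        index[chrom] = ([s for s, _, _ in intervals], intervals)
    return index


def annotate_variant(index: Index, chrom: str, pos: int) -> Optional[str]:
    """Return the id of the element containing pos, or None if there is none.

    Assumes elements on a chromosome do not overlap, so only the last
    element starting at or before pos needs to be checked.
    """
    if chrom not in index:
        return None
    starts, intervals = index[chrom]
    i = bisect_right(starts, pos) - 1
    if i < 0:
        return None
    _, end, element_id = intervals[i]
    return element_id if pos < end else None


# Made-up elements and variant positions, not taken from CREdb.
idx = build_index([
    ("chr1", 100, 300, "CRE_1"),
    ("chr1", 400, 500, "CRE_2"),
])
for chrom, pos in [("chr1", 150), ("chr1", 350), ("chr2", 150)]:
    print(chrom, pos, annotate_variant(idx, chrom, pos))
```

Binary search keeps each lookup logarithmic in the number of elements per chromosome, which matters at the scale of millions of consensus elements. If elements could overlap, an interval tree would be a more suitable structure than this sorted list.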
About the journal:
Epigenetics & Chromatin is a peer-reviewed, open access, online journal that publishes research and reviews providing novel insights into epigenetic inheritance and chromatin-based interactions. The journal aims to understand how gene and chromosomal elements are regulated and how their activities are maintained during processes such as cell division, differentiation and environmental alteration.