Unique Signatures of Highly Constrained Genes Across Publicly Available Genomic Databases
Pankaj Agrawal, Klaus E Schmitz-Abe, Qifei Li, Sunny Greene, Michela Borrelli, Shiyu Luo, Madesh Ramesh
bioRxiv - Genetics, 2024-09-05
DOI: 10.1101/2024.09.05.611529 (https://doi.org/10.1101/2024.09.05.611529)
Citations: 0
Abstract
Publicly available genomic databases and genetic constraint scores are crucial for understanding human population variation and for identifying variants that are likely to have a deleterious impact and cause human disease. We used one of the largest publicly available databases, gnomAD, to determine genes that are highly constrained for only loss-of-function (LoF) variants, only missense variants, or both LoF and missense variants, identified their unique signatures, and explored their causal relationship with human conditions. These genes were evaluated for unique patterns, including chromosomal location, tissue-level expression, gene ontology, and gene family categorization, using multiple publicly available databases. For the highly constrained genes associated with human disease, we identified unique patterns of inheritance, protein size, and enrichment in distinct molecular pathways. In addition, we identified a cohort of highly constrained genes that are not currently known to cause human disease, which we suggest as candidates to pursue as novel disease-associated genes. In summary, these insights elucidate the biological pathways of highly constrained genes, expanding our understanding of critical cellular proteins, and advance research in rare diseases.