Marija Orlic-Milacic, Karen Rothfels, Lisa Matthews, Adam Wright, Bijay Jassal, Veronica Shamovsky, Quang Trinh, Marc E Gillespie, Cristoffer Sevilla, Krishna Tiwari, Eliot Ragueneau, Chuqiao Gong, Ralf Stephan, Bruce May, Robin Haw, Joel Weiser, Deidre Beavers, Patrick Conley, Henning Hermjakob, Lincoln D Stein, Peter D'Eustachio, Guanming Wu
{"title":"Pathway-based, reaction-specific annotation of disease variants for elucidation of molecular phenotypes.","authors":"Marija Orlic-Milacic, Karen Rothfels, Lisa Matthews, Adam Wright, Bijay Jassal, Veronica Shamovsky, Quang Trinh, Marc E Gillespie, Cristoffer Sevilla, Krishna Tiwari, Eliot Ragueneau, Chuqiao Gong, Ralf Stephan, Bruce May, Robin Haw, Joel Weiser, Deidre Beavers, Patrick Conley, Henning Hermjakob, Lincoln D Stein, Peter D'Eustachio, Guanming Wu","doi":"10.1093/database/baae031","DOIUrl":"10.1093/database/baae031","url":null,"abstract":"Germline and somatic mutations can give rise to proteins with altered activity, including both gain and loss-of-function. The effects of these variants can be captured in disease-specific reactions and pathways that highlight the resulting changes to normal biology. A disease reaction is defined as an aberrant reaction in which a variant protein participates. A disease pathway is defined as a pathway that contains a disease reaction. Annotation of disease variants as participants of disease reactions and disease pathways can provide a standardized overview of molecular phenotypes of pathogenic variants that is amenable to computational mining and mathematical modeling. Reactome (https://reactome.org/), an open source, manually curated, peer-reviewed database of human biological pathways, in addition to providing annotations for >11 000 unique human proteins in the context of ∼15 000 wild-type reactions within more than 2000 wild-type pathways, also provides annotations for >4000 disease variants of close to 400 genes as participants of ∼800 disease reactions in the context of ∼400 disease pathways.
Functional annotation of disease variants proceeds from normal gene functions, described in wild-type reactions and pathways, through disease variants whose divergence from normal molecular behaviors has been experimentally verified, to extrapolation from molecular phenotypes of characterized variants to variants of unknown significance using criteria of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Reactome's data model enables mapping of disease variant datasets to specific disease reactions within disease pathways, providing a platform to infer pathway output impacts of numerous human disease variants and model organism orthologs, complementing computational predictions of variant pathogenicity. Database URL: https://reactome.org/.","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11184451/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140876125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to: ESKtides: a comprehensive database and mining method for ESKAPE phage-derived antimicrobial peptides.","authors":"","doi":"10.1093/database/baae035","DOIUrl":"10.1093/database/baae035","url":null,"abstract":"","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11184445/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140891796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yue Liu, Yuhuan Zhou, Xiumei Hu, Wuri Le-Ge, Haoyan Wang, Tao Jiang, Junyi Li, Yang Hu, Yadong Wang
{"title":"DIRMC: a database of immunotherapy-related molecular characteristics.","authors":"Yue Liu, Yuhuan Zhou, Xiumei Hu, Wuri Le-Ge, Haoyan Wang, Tao Jiang, Junyi Li, Yang Hu, Yadong Wang","doi":"10.1093/database/baae032","DOIUrl":"10.1093/database/baae032","url":null,"abstract":"Cancer immunotherapy has brought about a revolutionary breakthrough in the field of cancer treatment. Immunotherapy has changed the treatment landscape for a variety of solid and hematologic malignancies. To assist researchers in efficiently uncovering valuable information related to cancer immunotherapy, we present a manually curated comprehensive database called DIRMC, which focuses on molecular features involved in cancer immunotherapy. All the content was collected manually from published literature, authoritative clinical trial data submitted by clinicians, some databases for drug target prediction such as DrugBank, and some experimentally confirmed high-throughput data sets for the characterization of immune-related molecular interactions in cancer, such as a curated database of T-cell receptor sequences with known antigen specificity (VDJdb) and a pathology-associated TCR database (McPAS-TCR). By constructing a fully connected functional network, ranging from cancer-related gene mutations to target genes to translated target proteins to protein regions or sites that may specifically affect protein function, we aim to comprehensively characterize molecular features related to cancer immunotherapy. We have developed scoring criteria to assess the reliability of each MHC-peptide-T-cell receptor (TCR) interaction item to provide a reference for users. The database provides a user-friendly interface to browse and retrieve data by genes, target proteins, diseases and more. DIRMC also provides a download and submission page for researchers to access data of interest for further investigation or submit new interactions related to cancer immunotherapy targets.
Furthermore, DIRMC provides a graphical interface to help users predict the binding affinity between their own peptide of interest and MHC or TCR. This database will provide researchers with a one-stop resource to understand cancer immunotherapy-related targets as well as data on MHC-peptide-TCR interactions. It aims to offer reliable molecular characteristics support for both the analysis of the current status of cancer immunotherapy and the development of new immunotherapies. DIRMC is available at http://www.dirmc.tech/. Database URL: http://www.dirmc.tech/.","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11184449/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140876124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"cancercelllines.org—a novel resource for genomic variants in cancer cell lines","authors":"Rahel Paloots, Michael Baudis","doi":"10.1093/database/baae030","DOIUrl":"https://doi.org/10.1093/database/baae030","url":null,"abstract":"Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource—cancercelllines.org—with the aim to enable easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variants data. We have gathered over 5600 copy number profiles as well as single nucleotide variant annotations for 16 000 cell lines and provide these data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 Application Programming Interface (API) and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme. Database URL: https://cancercelllines.org","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140836575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laha Ale, Robert Gentleman, Teresa Filshtein Sonmez, Deepayan Sarkar, Christopher Endres
{"title":"nhanesA: achieving transparency and reproducibility in NHANES research","authors":"Laha Ale, Robert Gentleman, Teresa Filshtein Sonmez, Deepayan Sarkar, Christopher Endres","doi":"10.1093/database/baae028","DOIUrl":"https://doi.org/10.1093/database/baae028","url":null,"abstract":"The National Health and Nutrition Examination Survey provides comprehensive data on demographics, sociology, health and nutrition. Conducted in 2-year cycles since 1999, most of its data are publicly accessible, making it pivotal for research areas like studying social determinants of health or tracking trends in health metrics such as obesity or diabetes. Assembling the data and analyzing it presents a number of technical and analytic challenges. This paper introduces the nhanesA R package, which is designed to assist researchers in data retrieval and analysis and to enable the sharing and extension of prior research efforts. We believe that fostering community-driven activity in data reproducibility and sharing of analytic methods will greatly benefit the scientific community and propel scientific advancements. Database URL: https://github.com/cjendres1/nhanes","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140613965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pengyu Du, Yingli Chen, Qianzhong Li, Zhimin Gai, Hui Bai, Luqiang Zhang, Yuxian Liu, Yanni Cao, Yuanyuan Zhai, Wen Jin
{"title":"CancerMHL: the database of integrating key DNA methylation, histone modifications and lncRNAs in cancer","authors":"Pengyu Du, Yingli Chen, Qianzhong Li, Zhimin Gai, Hui Bai, Luqiang Zhang, Yuxian Liu, Yanni Cao, Yuanyuan Zhai, Wen Jin","doi":"10.1093/database/baae029","DOIUrl":"https://doi.org/10.1093/database/baae029","url":null,"abstract":"The discovery of key epigenetic modifications in cancer is of great significance for the study of disease biomarkers. Through the mining of epigenetic modification data relevant to cancer, research on epigenetic modifications is accumulating. In order to make it easier to integrate the effects of key epigenetic modifications on the related cancers, we established CancerMHL (http://www.positionprediction.cn/), which provides key DNA methylation, histone modifications and lncRNAs as well as the effect of these key epigenetic modifications on gene expression in several cancers. To facilitate data retrieval, CancerMHL offers flexible query options and filters, allowing users to access specific key epigenetic modifications according to their own needs. In addition, based on the epigenetic modification data, CancerMHL offers three online prediction tools for users. CancerMHL will be a useful resource platform for further exploring novel and potential biomarkers and therapeutic targets in cancer. 
Database URL: http://www.positionprediction.cn/","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140712472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STRIDE-DB: a comprehensive database for exploration of instability and phenotypic relevance of short tandem repeats in the human genome","authors":"Bharathram Uppili, Mohammed Faruq","doi":"10.1093/database/baae020","DOIUrl":"https://doi.org/10.1093/database/baae020","url":null,"abstract":"Short Tandem Repeats (STRs) are genetic markers made up of repeating DNA sequences. The variations of the STRs are widely studied in forensic analysis, population studies and genetic testing for a variety of neuromuscular disorders. Understanding polymorphic STR variation and its cause is crucial for deciphering genetic information and finding links to various disorders. In this paper, we present STRIDE-DB, a novel and unique platform to explore STR Instability and its Phenotypic Relevance, and a comprehensive database of STRs in the human genome. We utilized RepeatMasker to identify all the STRs in the human genome (hg19) and combined them with frequency data from the 1000 Genomes Project. STRIDE-DB, a user-friendly resource, plays a pivotal role in investigating the relationship between STR variation, instability and phenotype. By harnessing data from genome-wide association studies (GWAS), the ClinVar database, Alu loci, haploblocks in the genome and conservation of the STRs, it serves as an important tool for researchers exploring the variability of STRs in the human genome and its direct impact on phenotypes. STRIDE-DB has broad applicability and significance in research domains such as forensic science and repeat expansion disorders. 
Database URL: https://stridedb.igib.res.in.","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140589118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S Sivagnanam, S Yeu, K Lin, S Sakai, F Garzon, K Yoshimoto, K Prantzalos, D P Upadhyaya, A Majumdar, S S Sahoo, W W Lytton
{"title":"Towards building a trustworthy pipeline integrating Neuroscience Gateway and Open Science Chain","authors":"S Sivagnanam, S Yeu, K Lin, S Sakai, F Garzon, K Yoshimoto, K Prantzalos, D P Upadhyaya, A Majumdar, S S Sahoo, W W Lytton","doi":"10.1093/database/baae023","DOIUrl":"https://doi.org/10.1093/database/baae023","url":null,"abstract":"When a scientific dataset evolves or is reused in workflows creating derived datasets, the integrity of the dataset with its metadata information, including provenance, needs to be securely preserved while providing assurances that they are not accidentally or maliciously altered during the process. Providing a secure method to efficiently share and verify the data as well as metadata is essential for the reuse of the scientific data. The National Science Foundation (NSF) funded Open Science Chain (OSC) utilizes a consortium blockchain to provide a cyberinfrastructure solution to maintain integrity of the provenance metadata for published datasets and provides a way to perform independent verification of the dataset while promoting reuse and reproducibility. The NSF- and National Institutes of Health (NIH)-funded Neuroscience Gateway (NSG) provides a freely available web portal that allows neuroscience researchers to execute computational data analysis pipelines on high performance computing resources. Combined, the OSC and NSG platforms form an efficient, integrated framework to automatically and securely preserve and verify the integrity of the artifacts used in research workflows while using the NSG platform. This paper presents the results of the first study that integrates OSC–NSG frameworks to track the provenance of neurophysiological signal data analysis to study brain network dynamics using the Neuro-Integrative Connectivity tool, which is deployed in the NSG platform. 
Database URL: https://www.opensciencechain.org.","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140588924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jorge Novoa, Javier López-Ibáñez, Mónica Chagoyen, Juan A G Ranea, Florencio Pazos
{"title":"CoMentG: comprehensive retrieval of generic relationships between biomedical concepts from the scientific literature","authors":"Jorge Novoa, Javier López-Ibáñez, Mónica Chagoyen, Juan A G Ranea, Florencio Pazos","doi":"10.1093/database/baae025","DOIUrl":"https://doi.org/10.1093/database/baae025","url":null,"abstract":"The CoMentG resource contains millions of relationships between terms of biomedical interest obtained from the scientific literature. At the core of the system is a methodology for detecting significant co-mentions of concepts in the entire PubMed corpus. That method was applied to nine sets of terms covering the most important classes of biomedical concepts: diseases, symptoms/clinical signs, molecular functions, biological processes, cellular compartments, anatomic parts, cell types, bacteria and chemical compounds. We obtained more than 7 million relationships between more than 74 000 terms, and many types of relationships were not available in any other resource. As the terms were obtained from widely used resources and ontologies, the relationships are given using the standard identifiers provided by them and hence can be linked to other data. A web interface allows users to browse these associations, searching for relationships for a set of terms of interest provided as input, such as between a disease and its associated symptoms, underlying molecular processes or affected tissues. The results are presented in an interactive interface where the user can explore the reported relationships in different ways and follow links to other resources. 
Database URL: https://csbg.cnb.csic.es/CoMentG/","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140588926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Victor Avram, Shweta Yadav, Pranav Sahasrabudhe, Dan Chang, Jing Wang
{"title":"IBDTransDB: a manually curated transcriptomic database for inflammatory bowel disease","authors":"Victor Avram, Shweta Yadav, Pranav Sahasrabudhe, Dan Chang, Jing Wang","doi":"10.1093/database/baae026","DOIUrl":"https://doi.org/10.1093/database/baae026","url":null,"abstract":"Inflammatory Bowel Disease (IBD) therapies are ineffective in at least 40% of patients, and transcriptomic datasets have been widely used to reveal the pathogenesis and to identify novel drug targets for these patients. Although public IBD transcriptomic datasets are available from many web-based tools/databases, due to the unstructured metadata and data description of these public datasets, most of these tools/databases do not allow querying datasets based on multiple keywords (e.g. colon and infliximab). Furthermore, few tools/databases can compare and integrate the datasets from the query results. To fill these gaps, we have developed IBDTransDB (https://abbviegrc.shinyapps.io/ibdtransdb/), a manually curated transcriptomic database for IBD. IBDTransDB includes a manually curated database with 34 transcriptomic datasets (2932 samples, 122 differential comparisons) and a query system supporting 35 keywords from 5 attributes (e.g. tissue and treatment). IBDTransDB also provides three modules for data analyses and integration. IBDExplore allows interactive visualization of differential gene lists, pathway enrichment, gene signature and cell deconvolution analyses from a single dataset. IBDCompare supports comparisons of selected genes or pathways from multiple datasets across different conditions. IBDIntegrate performs meta-analysis to prioritize a list of genes/pathways based on user-selected datasets and conditions. Using two case studies related to infliximab treatment, we demonstrated that IBDTransDB provides a unique platform for biologists and clinicians to reveal IBD pathogenesis and identify novel targets by integrating with other omics data. 
Database URL: https://abbviegrc.shinyapps.io/ibdtransdb/","PeriodicalId":10923,"journal":{"name":"Database: The Journal of Biological Databases and Curation","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140588778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}