{"title":"RBPWorld for exploring functions and disease associations of RNA-binding proteins across species.","authors":"Jian-You Liao, Bing Yang, Chuan-Ping Shi, Wei-Xi Deng, Jin-Si Deng, Mei-Feng Cen, Bing-Qi Zheng, Zi-Ling Zhan, Qiao-Ling Liang, Ji-En Wang, Shuang Tao, Daning Lu, Maojin Liang, Yu-Chan Zhang, Dong Yin","doi":"10.1093/nar/gkae1028","DOIUrl":"https://doi.org/10.1093/nar/gkae1028","url":null,"abstract":"<p><p>RNA-binding proteins (RBPs) play key roles in a wide range of physiological and pathological processes. To facilitate the investigation of RBP functions and disease associations, we updated EuRBPDB and renamed it RBPWorld (http://research.gzsys.org.cn/rbpworld/#/home). Leveraging 998 RNA-binding domains (RBDs) and 87 RNA-binding Proteome (RBPome) datasets, we identified 1 393 413 RBPs from 445 species, including 3030 human RBPs (hRBPs). RBPWorld includes primary RNA targets of diverse hRBPs, as well as potential downstream regulatory pathways and alternative splicing patterns governed by various hRBPs. These insights were derived from analyses of 1515 crosslinking immunoprecipitation-seq datasets and 616 RNA-seq datasets from cells with hRBP gene knockdown or knockout. Furthermore, we systematically identified 929 multifunctional RBPs, including RBPs that also act as metabolic enzymes and transcription factors. RBPWorld includes 838 disease-associated hRBPs and 970 hRBPs that interact with 12 disease-causing RNA viruses, allowing users to explore the regulatory roles of hRBPs within the context of diseases. Finally, we developed an intuitive interface for RBPWorld that gives users easy access to all the included data. 
We believe that RBPWorld will be a valuable resource in advancing our understanding of the biological roles of RBPs across different species.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142576719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genomes OnLine Database (GOLD) v.10: new features and updates.","authors":"Supratim Mukherjee, Dimitri Stamatis, Cindy Tianqing Li, Galina Ovchinnikova, Mahathi Kandimalla, Van Handke, Anuha Reddy, Natalia Ivanova, Tanja Woyke, Emiley A Eloe-Fadrosh, I-Min A Chen, Nikos C Kyrpides, T B K Reddy","doi":"10.1093/nar/gkae1000","DOIUrl":"https://doi.org/10.1093/nar/gkae1000","url":null,"abstract":"<p><p>The Genomes OnLine Database (GOLD; https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute is a comprehensive online metadata repository designed to catalog and manage information related to (meta)genomic sequence projects. GOLD provides a centralized platform where researchers can access a wide array of metadata from its four organizational levels, namely Study, Organism/Biosample, Sequencing Project and Analysis Project. GOLD continues to serve as a valuable resource and has seen significant growth and expansion since its inception in 1997. With its expanded role as a collaborative platform, it not only actively imports data from other primary repositories such as the National Center for Biotechnology Information but also supports contributions from researchers worldwide. This collaborative approach has enriched the database with diverse datasets, creating a more integrated resource to enhance scientific insights. As genomic research becomes increasingly integral to various scientific disciplines, more researchers and institutions are turning to GOLD for their metadata needs. To meet this growing demand, GOLD has expanded by adding diverse metadata fields, intuitive features, advanced search capabilities and enhanced data visualization tools, making it easier for users to find and interpret relevant information. 
This manuscript provides an update and highlights the new features introduced over the last 2 years.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142576718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direct testing of natural twister ribozymes from over a thousand organisms reveals a broad tolerance for structural imperfections.","authors":"Lauren N McKinley, McCauley O Meyer, Aswathy Sebastian, Benjamin K Chang, Kyle J Messina, Istvan Albert, Philip C Bevilacqua","doi":"10.1093/nar/gkae908","DOIUrl":"https://doi.org/10.1093/nar/gkae908","url":null,"abstract":"<p><p>Twister ribozymes are an extensively studied class of nucleolytic RNAs. Thousands of natural twisters have been proposed using sequence homology and structural descriptors. Yet, most of these candidates have not been validated experimentally. To address this gap, we developed Cleavage High-Throughput Assay (CHiTA), a high-throughput pipeline utilizing massively parallel oligonucleotide synthesis and next-generation sequencing to test putative ribozymes en masse in a scarless fashion. As proof of principle, we applied CHiTA to a small set of known active and mutant ribozymes. We then used CHiTA to test two large sets of naturally occurring twister ribozymes: over 1600 previously reported putative twisters and ∼1000 new candidate twisters. The new candidates were identified computationally in ∼1000 organisms, representing a massive increase in the number of ribozyme-harboring organisms. Approximately 94% of the twisters we tested were active and cleaved site-specifically. Analysis of their structural features revealed that many substitutions and helical imperfections can be tolerated. We repeated our computational search with structural descriptors updated from this analysis, whereupon we identified and confirmed the first intrinsically active twister ribozyme in mammals. 
CHiTA broadly expands the number of active twister ribozymes found in nature and provides a powerful method for functional analyses of other RNAs.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142576715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"StreptomeDB 4.0: a comprehensive database of streptomycetes natural products enriched with protein interactions and interactive spectral visualization.","authors":"Yue Feng, Ammar Qaseem, Aurélien F A Moumbock, Shuling Pan, Pascal A Kirchner, Conrad V Simoben, Yvette I Malange, Smith B Babiaka, Mingjie Gao, Stefan Günther","doi":"10.1093/nar/gkae1030","DOIUrl":"https://doi.org/10.1093/nar/gkae1030","url":null,"abstract":"<p><p>Streptomycetes remain an important bacterial source of natural products (NPs) with significant therapeutic promise, particularly in the fight against antimicrobial resistance. Herein, we present StreptomeDB 4.0, a substantial update of the database that includes expanded content and several new features. Currently, StreptomeDB 4.0 contains over 8500 NPs originating from ∼3900 streptomycetes, manually annotated from ∼7600 PubMed-indexed peer-reviewed articles. The database was enhanced by two in-house developments: (i) automated literature-mined NP-protein relationships (hyperlinked to the CPRiL web server) and (ii) pharmacophore-based NP-protein interactions (predicted with the ePharmaLib dataset). Moreover, genome mining was supplemented through hyperlinks to the widely used antiSMASH database. To facilitate NP structural dereplication, interactive visualization tools were implemented, namely the JSpecView applet and plotly.js charting library for predicted nuclear magnetic resonance and mass spectrometry spectral data, respectively. Furthermore, both the backend database and the frontend web interface were redesigned, and several software packages, including PostgreSQL and Django, were updated to the latest versions. Overall, this comprehensive database serves as a vital resource for researchers seeking to delve into the metabolic intricacies of streptomycetes and discover novel therapeutics, notably antimicrobial agents. 
StreptomeDB is publicly accessible at https://www.pharmbioinf.uni-freiburg.de/streptomedb.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142576721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CTR-DB 2.0: an updated cancer clinical transcriptome resource, expanding primary drug resistance and newly adding acquired resistance datasets and enhancing the discovery and validation of predictive biomarkers.","authors":"Jianzhou Jiang, Yajie Ma, Lele Yang, Shurui Ma, Zixuan Yu, Xinyi Ren, Xiangya Kong, Xinlei Zhang, Dong Li, Zhongyang Liu","doi":"10.1093/nar/gkae993","DOIUrl":"https://doi.org/10.1093/nar/gkae993","url":null,"abstract":"<p><p>Drug resistance is a principal limiting factor in cancer treatment. CTR-DB, the Cancer Treatment Response gene signature DataBase, is the first data resource for clinical transcriptomes with cancer treatment response; it also supports various data analysis functions, providing insights into the molecular determinants of drug resistance. Here we present an upgraded version, CTR-DB 2.0 (http://ctrdb.ncpsb.org.cn). Around 190 up-to-date source datasets with primary resistance information (129% increase compared to version 1.0) and 13 acquired-resistant datasets (a new dataset type), covering 10 856 patient samples (111% increase), 39 cancer types (39% increase) and 346 therapeutic regimens (26% increase), have been collected. In terms of function, for the single dataset analysis and multiple-dataset comparison modules, CTR-DB 2.0 adds new gene set enrichment, tumor microenvironment (TME) and signature connectivity analysis functions to help elucidate drug resistance mechanisms and their homogeneity/heterogeneity and discover candidate combinational therapies. Furthermore, biomarker-related functions were greatly extended. CTR-DB 2.0 now supports the validation of cell types in the TME as predictive biomarkers of treatment response, especially the validation of a combinational biomarker panel and even the direct discovery of the optimal biomarker panel using user-customized CTR-DB patient samples. 
In addition, analysis of users' own datasets, an application programming interface and data crowdfunding were added.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142569179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DNAproDB: an updated database for the automated and interactive analysis of protein-DNA complexes.","authors":"Raktim Mitra, Ari S Cohen, Jared M Sagendorf, Helen M Berman, Remo Rohs","doi":"10.1093/nar/gkae970","DOIUrl":"https://doi.org/10.1093/nar/gkae970","url":null,"abstract":"<p><p>DNAproDB (https://dnaprodb.usc.edu/) is a database, visualization tool, and processing pipeline for analyzing structural features of protein-DNA interactions. Here, we present a substantially updated version of the database through additional structural annotations, search, and user interface functionalities. The update expands the number of pre-analyzed protein-DNA structures, which are automatically updated weekly. The analysis pipeline identifies water-mediated hydrogen bonds that are incorporated into the visualizations of protein-DNA complexes. Tertiary structure-aware nucleotide layouts are now available. New file formats and external database annotations are supported. The website has been redesigned, and interacting with graphs and data is more intuitive. We also present a statistical analysis on the updated collection of structures revealing salient patterns in protein-DNA interactions.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142569214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pairpot: a database with real-time lasso-based analysis tailored for paired single-cell and spatial transcriptomics.","authors":"Zhihan Ruan, Fan Lin, Zhenjie Zhang, Jiayue Cao, Wenting Xiang, Xiaoyi Wei, Jian Liu","doi":"10.1093/nar/gkae986","DOIUrl":"https://doi.org/10.1093/nar/gkae986","url":null,"abstract":"<p><p>Paired single-cell and spatially resolved transcriptomics (SRT) data supplement each other, providing in-depth insights into biological processes and disease mechanisms. Previous SRT databases have limitations in curating sufficient single-cell and SRT pairs (SC-SP pairs) and providing real-time heuristic analysis, which hinder efforts to uncover potential biological insights. Here, we developed Pairpot (http://pairpot.bioxai.cn), a database tailored for paired single-cell and SRT data with real-time heuristic analysis. Pairpot curates 99 high-quality pairs including 1,425,656 spots from 299 datasets, and creates the association networks. It constructs the curated pairs by integrating multiple slices and establishing potential associations between single-cell and SRT data. On this basis, Pairpot adopts semi-supervised learning that enables real-time heuristic analysis for SC-SP pairs: Lasso-View refines the user-selected SRT domains within milliseconds, Pair-View infers cell proportions of spots based on user-selected cell types in real time, and Layer-View displays SRT slices using a 3D hierarchical layout. 
Experiments demonstrated Pairpot's efficiency in identifying heterogeneous domains and cell proportions.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142569224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"scImmOmics: a manually curated resource of single-cell multi-omics immune data.","authors":"Yan-Yu Li, Li-Wei Zhou, Feng-Cui Qian, Qiao-Li Fang, Zheng-Min Yu, Ting Cui, Fu-Juan Dong, Fu-Hong Cai, Ting-Ting Yu, Li-Dong Li, Qiu-Yu Wang, Yan-Bing Zhu, Hui-Fang Tang, Bao-Yang Hu, Chun-Quan Li","doi":"10.1093/nar/gkae985","DOIUrl":"https://doi.org/10.1093/nar/gkae985","url":null,"abstract":"<p><p>Single-cell sequencing technology has enabled the discovery and characterization of subpopulations of immune cells with unique functions, which is critical for revealing immune responses under healthy or disease conditions. Efforts have been made to collect and curate single-cell RNA sequencing (scRNA-seq) data, yet an immune-specific single-cell multi-omics atlas with harmonized metadata is still lacking. Here, we present scImmOmics (https://bio.liclab.net/scImmOmics/home), a manually curated single-cell multi-omics immune database constructed based on high-quality immune cells with known immune cell labels. Currently, scImmOmics documents >2.9 million cell-type labeled immune cells derived from seven single-cell sequencing technologies, involving 131 immune cell types, 47 tissues and 4 species. To ensure data consistency, we standardized the nomenclature of immune cell types and presented them in a hierarchical tree structure to clearly describe the lineage relationships within the immune system. scImmOmics also provides comprehensive immune regulatory information, including T-cell/B-cell receptor sequencing clonotype information, cell-specific regulatory information (e.g. gene/chromatin accessibility/protein/transcription factor states within known cell types, cell-to-cell communication and co-expression networks) and immune cell responses to cytokines. 
Collectively, scImmOmics is a comprehensive and valuable platform for unraveling the heterogeneity and diversity of immune cells and elucidating the specific regulatory mechanisms at the single-cell level.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142569230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NusG-dependent RNA polymerase pausing is a common feature of riboswitch regulatory mechanisms.","authors":"Oshadhi T Jayasinghe, Laura E Ritchey, Thomas Breil, Paxton Newman, Helen Yakhnin, Paul Babitzke","doi":"10.1093/nar/gkae981","DOIUrl":"https://doi.org/10.1093/nar/gkae981","url":null,"abstract":"<p><p>Transcription by RNA polymerase is punctuated by transient pausing events. Pausing provides time for RNA folding and binding of regulatory factors to the paused elongation complex. We previously identified 1600 NusG-dependent pauses throughout the Bacillus subtilis genome, with ∼20% localized to 5' leader regions, suggesting a regulatory role for these pauses. We examined pauses associated with known riboswitches to determine whether pausing is a common feature of these mechanisms. NusG-dependent pauses in the fmnP, tenA, mgtE, lysP and mtnK riboswitches were in strategic positions preceding the decision between the formation of alternative antiterminator and terminator structures, which is a critical feature of transcription attenuation mechanisms. In vitro transcription assays demonstrated that pausing increased the frequency of termination in the presence of the cognate ligand. NusG-dependent pausing also reduced the ligand concentration required for efficient termination. In vivo expression studies with transcriptional fusions confirmed that NusG-dependent pausing is a critical component of each riboswitch mechanism. 
Our results indicate that pausing enables cells to sense a broader range of ligand concentrations for fine-tuning riboswitch attenuation mechanisms.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142569222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The PRIDE database at 20 years: 2025 update.","authors":"Yasset Perez-Riverol, Chakradhar Bandla, Deepti J Kundu, Selvakumar Kamatchinathan, Jingwen Bai, Suresh Hewapathirana, Nithu Sara John, Ananth Prakash, Mathias Walzer, Shengbo Wang, Juan Antonio Vizcaíno","doi":"10.1093/nar/gkae1011","DOIUrl":"https://doi.org/10.1093/nar/gkae1011","url":null,"abstract":"<p><p>The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's leading mass spectrometry (MS)-based proteomics data repository and one of the founding members of the ProteomeXchange consortium. This manuscript summarizes the developments in PRIDE resources and related tools for the last three years. The number of datasets submitted to PRIDE Archive (the archival component of PRIDE) has reached an average of around 534 per month. This has been possible thanks to continuous improvements in infrastructure such as a new file transfer protocol for very large datasets (Globus), a new data resubmission pipeline and an automatic dataset validation process. Additionally, we highlight novel activities such as the availability of the PRIDE chatbot (based on the use of open-source Large Language Models), and our work to improve support for MS crosslinking datasets. 
Furthermore, we will describe how we have increased our efforts to reuse, reanalyze and disseminate high-quality proteomics data into added-value resources such as UniProt, Ensembl and Expression Atlas.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142569241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}