Maodi Liang, Chenhao Zhang, Yang Yang, Qinghua Cui, Jun Zhang, Chunmei Cui
{"title":"TransmiR v3.0: an updated transcription factor-microRNA regulation database","authors":"Maodi Liang, Chenhao Zhang, Yang Yang, Qinghua Cui, Jun Zhang, Chunmei Cui","doi":"10.1093/nar/gkae1081","DOIUrl":"https://doi.org/10.1093/nar/gkae1081","url":null,"abstract":"microRNAs (miRNAs) are active in various biological processes by mediating gene expression, and the full investigation of miRNA transcription is crucial for understanding the mechanisms underlying miRNA deregulation in pathological conditions. Here an updated TransmiR v3.0 database is presented with more comprehensive miRNA transcription regulation information, which contains 5095 transcription factor (TF) -miRNA regulations curated from 2285 papers and >6 million TF–miRNA regulations derived from ChIP-seq data. Currently, TransmiR v3.0 covers 3260 TFs, 4253 miRNAs and 514 433 TF–miRNA regulation pairs across 29 organisms. Additionally, motif scanning of TF loci on promoter sequences of miRNAs from multiple species is employed to predict TF–miRNA regulations, generating 284 527 predicted TF–miRNA regulations. Besides the significant growth of data volume, we also improve the annotations for TFs and miRNAs by introducing the TF family, TFBS motif, and expression profiles for several species. Moreover, the functionality of the TransmiR v3.0 online database is enhanced, including allowing batch search for flexible queries and offering more extensive disease-specific, as well as newly sex-specific TF–miRNA regulation networks in the ‘Network’ module. 
TransmiR v3.0 provides a useful resource for studying miRNA biogenesis regulation and can be freely accessed at http://www.cuilab.cn/transmir.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"34 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The evolution of dbSNP: 25 years of impact in genomic research","authors":"Lon Phan, Hua Zhang, Qiang Wang, Ricardo Villamarin, Tim Hefferon, Aravinthan Ramanathan, Brandi Kattman","doi":"10.1093/nar/gkae977","DOIUrl":"https://doi.org/10.1093/nar/gkae977","url":null,"abstract":"The Single Nucleotide Polymorphism Database (dbSNP), established in 1998 by the National Center for Biotechnology Information (NCBI), has been a critical resource in genomics for cataloging small genetic variations. Originally focused on single nucleotide polymorphisms (SNPs), dbSNP has since expanded to include a variety of genetic variants, playing a key role in genome-wide association studies (GWAS), population genetics, pharmacogenomics, and cancer research. Over 25 years, dbSNP has grown to include more than 4.4 billion submitted SNPs and 1.1 billion unique reference SNPs, providing essential data for identifying disease-related genetic variants and studying human diversity. Integrating large-scale projects like 1000 Genomes, gnomAD, TOPMed, and ALFA has expanded dbSNP’s catalog of human genetic variation, increasing its usefulness for research and clinical applications. Keeping up with advancements such as next-generation sequencing and cloud-based infrastructure, dbSNP remains a cornerstone of genetic research supporting continued discoveries in precision medicine and population genomics. 
DATABASE URL: https://www.ncbi.nlm.nih.gov/snp.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"19 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Youngbin Moon, Christina J Herrmann, Aleksei Mironov, Mihaela Zavolan
{"title":"PolyASite v3.0: a multi-species atlas of polyadenylation sites inferred from single-cell RNA-sequencing data.","authors":"Youngbin Moon, Christina J Herrmann, Aleksei Mironov, Mihaela Zavolan","doi":"10.1093/nar/gkae1043","DOIUrl":"https://doi.org/10.1093/nar/gkae1043","url":null,"abstract":"The broadly used 10X Genomics technology for single-cell RNA sequencing (scRNA-seq) captures RNA 3' ends. Thus, some reads contain part of the non-templated polyadenosine tails, providing direct evidence for the sites of 3' end cleavage and polyadenylation on the respective RNAs. Taking advantage of this property, we recently developed the SCINPAS workflow to infer polyadenylation sites (PASs) from scRNA-seq data. Here, we used this workflow to construct version 3.0 (v3.0, https://polyasite.unibas.ch/) of the PolyASite Atlas from a big compendium of publicly available human, mouse and worm scRNA-seq datasets obtained from healthy tissues. As the resolution of scRNA-seq was too low for robust detection of cell-level differences in PAS usage, we aggregated samples based on their tissue-of-origin to construct tissue-level catalogs of PASs. These provide qualitatively new information about PAS usage, in comparison to the previous PAS catalogs that were based on bulk 3' end sequencing experiments primarily in cell lines. In the new version, we document stringency levels associated with each PAS so that users can balance sensitivity and specificity in their analysis. 
We also upgraded the integration with the UCSC Genome Browser and developed track hubs conveniently displaying pooled and tissue-specific expression of PASs.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142624782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nicholas Marzano, Brady Johnston, Bishnu P Paudel, Jason Schmidberger, Slobodan Jergic, Till Böcking, Mark Agostino, Ian Small, Antoine M van Oijen, Charles S Bond
{"title":"Single-molecule visualization of sequence-specific RNA binding by a designer PPR protein","authors":"Nicholas Marzano, Brady Johnston, Bishnu P Paudel, Jason Schmidberger, Slobodan Jergic, Till Böcking, Mark Agostino, Ian Small, Antoine M van Oijen, Charles S Bond","doi":"10.1093/nar/gkae984","DOIUrl":"https://doi.org/10.1093/nar/gkae984","url":null,"abstract":"Pentatricopeptide repeat proteins (PPR) are a large family of modular RNA-binding proteins, whereby each module can be modified to bind to a specific ssRNA nucleobase. As such, there is interest in developing ‘designer’ PPRs (dPPRs) for a range of biotechnology applications, including diagnostics or in vivo localization of ssRNA species; however, the mechanistic details regarding how PPRs search for and bind to target sequences is unclear. To address this, we determined the structure of a dPPR bound to its target sequence and used two- and three-color single-molecule fluorescence resonance energy transfer to interrogate the mechanism of ssRNA binding to individual dPPRs in real time. We demonstrate that dPPRs are slower to bind longer ssRNA sequences (or could not bind at all) and that this is, in part, due to their propensity to form stable secondary structures that sequester the target sequence from dPPR. Importantly, dPPR binds only to its target sequence (i.e. it does not associate with non-target ssRNA sequences) and does not ‘scan’ longer ssRNA oligonucleotides for the target sequence. The kinetic constraints imposed by random 3D diffusion may explain the long-standing conundrum of why PPR proteins are abundant in organelles, but almost unknown outside them (i.e. 
in the cytosol and nucleus).","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"72 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sarah D P Wilhelm, Jenica H Kakadia, Aruun Beharry, Rosan Kenana, Kyle S Hoffman, Patrick O’Donoghue, Ilka U Heinemann
{"title":"Transfer RNA supplementation rescues HARS deficiency in a humanized yeast model of Charcot-Marie-Tooth disease","authors":"Sarah D P Wilhelm, Jenica H Kakadia, Aruun Beharry, Rosan Kenana, Kyle S Hoffman, Patrick O’Donoghue, Ilka U Heinemann","doi":"10.1093/nar/gkae996","DOIUrl":"https://doi.org/10.1093/nar/gkae996","url":null,"abstract":"Aminoacyl-tRNA synthetases are indispensable enzymes in all cells, ensuring the correct pairing of amino acids to their cognate tRNAs to maintain translation fidelity. Autosomal dominant mutations V133F and Y330C in histidyl-tRNA synthetase (HARS) cause the genetic disorder Charcot-Marie-Tooth type 2W (CMT2W). Treatments are currently restricted to symptom relief, with no therapeutic available that targets the cause of disease. We previously found that histidine supplementation alleviated phenotypic defects in a humanized yeast model of CMT2W caused by HARS V155G and S356N that also unexpectedly exacerbated the phenotype of the two HARS mutants V133F and Y330C. Here, we show that V133F destabilizes recombinant HARS protein, which is rescued in the presence of tRNAHis. HARS V133F and Y330C cause mistranslation and cause changes to the proteome without activating the integrated stress response as validated by mass spectrometry and growth defects that persist with histidine supplementation. The growth defects and reduced translation fidelity caused by V133F and Y330C mutants were rescued by supplementation with human tRNAHis in a humanized yeast model. 
Our results demonstrate the feasibility of cognate tRNA as a therapeutic that rescues HARS deficiency and ameliorates toxic mistranslation generated by causative alleles for CMT.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"95 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Duzdevich, Christopher E Carr, Ben W F Colville, Harry R M Aitken, Jack W Szostak
{"title":"Overcoming nucleotide bias in the nonenzymatic copying of RNA templates","authors":"Daniel Duzdevich, Christopher E Carr, Ben W F Colville, Harry R M Aitken, Jack W Szostak","doi":"10.1093/nar/gkae982","DOIUrl":"https://doi.org/10.1093/nar/gkae982","url":null,"abstract":"The RNA World hypothesis posits that RNA was the molecule of both heredity and function during the emergence of life. This hypothesis implies that RNA templates can be copied, and ultimately replicated, without the catalytic aid of evolved enzymes. A major problem with nonenzymatic template-directed polymerization has been the very poor copying of sequences containing rA and rU. Here, we overcome that problem by using a prebiotically plausible mixture of RNA mononucleotides and random-sequence oligonucleotides, all activated by methyl isocyanide chemistry, that direct the uniform copying of arbitrary-sequence templates, including those harboring rA and rU. We further show that the use of this mixture in copying reactions suppresses copying errors while also generating a more uniform distribution of mismatches than observed for simpler systems. We find that oligonucleotide competition for template binding sites, oligonucleotide ligation and the template binding properties of reactant intermediates work together to reduce product sequence bias and errors. Finally, we show that iterative cycling of templated polymerization and activation chemistry improves the yields of random-sequence products. 
These results for random-sequence template copying are a significant advance in the pursuit of nonenzymatic RNA replication.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"156 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TranscriptDB: a transcript-centric database to study eukaryotic transcript conservation and evolution.","authors":"Wend Yam D D Ouedraogo, Aida Ouangraoua","doi":"10.1093/nar/gkae995","DOIUrl":"https://doi.org/10.1093/nar/gkae995","url":null,"abstract":"Eukaryotic genes can encode multiple distinct transcripts through the alternative splicing (AS) of genes. Interest in the AS mechanism and its evolution across different species has stimulated numerous studies, leading to several databases that provide information on AS and transcriptome data across multiple eukaryotic species. However, existing resources do not offer information on transcript conservation and evolution between genes of multiple species. Similarly to genes, identifying conserved transcripts (those from homologous genes that have retained a similar exon composition) is useful for determining transcript homology relationships, studying transcript functions and reconstructing transcript phylogenies. To address this gap, we have developed TranscriptDB, a database dedicated to studying the conservation and evolution of transcripts within gene families. TranscriptDB offers an extensive catalog of conserved transcripts and phylogenies for 317 annotated eukaryotic species, sourced from Ensembl database version 111. It serves multiple purposes, including the exploration of gene and transcript evolution. Users can access TranscriptDB through various browsing and querying tools, including a user-friendly web interface. The incorporated web servers enable users to retrieve information on transcript evolution using their own data as input. Additionally, a REST application programming interface is available for programmatic data retrieval. A data directory is also available for bulk downloads. 
TranscriptDB and its resources are freely accessible at https://transcriptdb.cobius.usherbrooke.ca.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142624793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CRISPRepi: a multi-omic atlas for CRISPR-based epigenome editing","authors":"Leisheng Shi, Shasha Li, Rongyi Zhu, Chenyang Lu, Xintian Xu, Changzhi Li, Xinyue Huang, Xiaolu Zhao, Fengbiao Mao, Kailong Li","doi":"10.1093/nar/gkae1039","DOIUrl":"https://doi.org/10.1093/nar/gkae1039","url":null,"abstract":"CRISPR-based epigenome editing integrates the precision of CRISPR with the capability of epigenetic mark rewriting, offering a tunable and reversible gene regulation strategy without altering the DNA sequences. Various epigenome editing systems have been developed and applied in different organisms and cell types; however, the detailed information is discrete, making it challenging to evaluate the precision of different editing systems and design the optimal sgRNAs for further functional studies. Herein, we developed CRISPRepi (http://crisprepi.maolab.org/ or http://crisprepi.lilab-pkuhsc.org/), a pioneering platform that consolidates extensive sequencing data from 671 meticulously curated RNA-seq, ChIP-seq, Bisulfite-seq and ATAC-seq datasets in 87 cell types manipulated by 74 epigenome editing systems. In total, we have curated 5962 sgRNAs associated with 283 target genes from 2277 samples across six species. CRISPRepi incorporates tools for analyzing editing outcomes and assessing off-target effects by analyzing gene expression changes pre- and post-editing, along with the details of multi-omic epigenetic landscapes. Moreover, CRISPRepi supports the investigation of editing potentials for newly designed sgRNA sequences in a cell/tissue-specific context. 
By providing a user-friendly interface for searching and selecting optimal editing designs across multiple organisms, CRISPRepi serves as an integrated resource for researchers to evaluate editing efficiency and off-target effects among diverse CRISPR-based epigenome editing systems.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"25 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuntao Yang, Himansu Kumar, Yuhan Xie, Zhao Li, Rongbin Li, Wenbo Chen, Chiamaka S Diala, Meer A Ali, Yi Xu, Albon Wu, Sayed-Rzgar Hosseini, Erfei Bi, Hongyu Zhao, Pora Kim, W Jim Zheng
{"title":"ASpdb: an integrative knowledgebase of human protein isoforms from experimental and AI-predicted structures","authors":"Yuntao Yang, Himansu Kumar, Yuhan Xie, Zhao Li, Rongbin Li, Wenbo Chen, Chiamaka S Diala, Meer A Ali, Yi Xu, Albon Wu, Sayed-Rzgar Hosseini, Erfei Bi, Hongyu Zhao, Pora Kim, W Jim Zheng","doi":"10.1093/nar/gkae1018","DOIUrl":"https://doi.org/10.1093/nar/gkae1018","url":null,"abstract":"Alternative splicing is a crucial cellular process in eukaryotes, enabling the generation of multiple protein isoforms with diverse functions from a single gene. To better understand the impact of alternative splicing on protein structures, protein–protein interaction and human diseases, we developed ASpdb (https://biodataai.uth.edu/ASpdb/), a comprehensive database integrating experimentally determined structures and AlphaFold 2-predicted models for human protein isoforms. ASpdb includes over 3400 canonical isoforms, each represented by both experimentally resolved and predicted structures, and >7200 alternative isoforms with AlphaFold 2 predictions. In addition to detailed splicing events, 3D structures, sequence variations and functional annotations, ASpdb uniquely offers comparative analyses and visualization of structural alterations among isoforms. This resource is invaluable for advancing research in alternative splicing, structural biology and disease mechanisms.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"11 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maria Cerezo, Elliot Sollis, Yue Ji, Elizabeth Lewis, Ala Abid, Karatuğ Ozan Bircan, Peggy Hall, James Hayhurst, Sajo John, Abayomi Mosaku, Santhi Ramachandran, Amy Foreman, Arwa Ibrahim, James McLaughlin, Zoë Pendlington, Ray Stefancsik, Samuel A Lambert, Aoife McMahon, Joannella Morales, Thomas Keane, Michael Inouye, Helen Parkinson, Laura W Harris
{"title":"The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity","authors":"Maria Cerezo, Elliot Sollis, Yue Ji, Elizabeth Lewis, Ala Abid, Karatuğ Ozan Bircan, Peggy Hall, James Hayhurst, Sajo John, Abayomi Mosaku, Santhi Ramachandran, Amy Foreman, Arwa Ibrahim, James McLaughlin, Zoë Pendlington, Ray Stefancsik, Samuel A Lambert, Aoife McMahon, Joannella Morales, Thomas Keane, Michael Inouye, Helen Parkinson, Laura W Harris","doi":"10.1093/nar/gkae1070","DOIUrl":"https://doi.org/10.1093/nar/gkae1070","url":null,"abstract":"The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7 000 publications for >15 000 traits, from which more than 625 000 lead associations have been curated. Additionally, 85 000 full genome-wide summary statistics datasets (containing association data for all variants in the analysis) are available for downstream analyses such as meta-analysis, fine-mapping, Mendelian randomisation or development of polygenic risk scores. As a centralised repository for GWAS results, the GWAS Catalog sets and implements standards for data submission and harmonisation, and encourages the use of consistent descriptors for traits, samples and methodologies. We share processes and vocabulary with the PGS Catalog, improving interoperability for a growing user group. Here, we describe the latest changes in data content, improvements in our user interface, and the implementation of the GWAS-SSF standard format for summary statistics. 
We address the challenges of handling the rapid increase in large-scale molecular quantitative trait GWAS and the need for sensitivity in the use of population and cohort descriptors while maintaining data interoperability and reusability.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"529 1","pages":""},"PeriodicalIF":14.9,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}