NAR Genomics and Bioinformatics: latest articles

LoVis4u: a locus visualization tool for comparative genomics and coverage profiles.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-02-24 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf009
Artyom A Egorov, Gemma C Atkinson
Abstract: Comparative genomic analysis often involves visualization of alignments of genomic loci. While several software tools are available for this task, ranging from Python and R libraries to stand-alone graphical user interfaces, a tool is lacking that offers fast, automated usage and the production of publication-ready vector images. Here we present LoVis4u, a command-line tool and Python API designed for highly customizable and fast visualization of multiple genomic loci. LoVis4u generates vector images in PDF format based on annotation data from GenBank or GFF files. It is capable of visualizing entire genomes of bacteriophages as well as plasmids and user-defined regions of longer prokaryotic genomes. Additionally, LoVis4u offers optional data processing steps to identify and highlight accessory and core genes in input sequences. Finally, LoVis4u supports the visualization of genomic signal track profiles from sequencing experiments. LoVis4u is implemented in Python3 and runs on Linux and MacOS. The command-line interface covers most practical use cases, while the provided Python API allows usage within a Python program, integration into external tools, and additional customization. The source code is available at the GitHub page: github.com/art-egorov/lovis4u. Detailed documentation that includes an example-driven guide is available from the software home page: art-egorov.github.io/lovis4u.
Journal Article. Volume 7, issue 1, article lqaf009. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11850299/pdf/
Citations: 0
Halfpipe: a tool for analyzing metabolic labeling RNA-seq data to quantify RNA half-lives.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-02-18 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf006
Jason M Müller, Elisabeth Altendorfer, Susanne Freier, Katharina Moos, Andreas Mayer, Achim Tresch
Abstract: We introduce Halfpipe, a tool for analyzing RNA-seq data from metabolic RNA labeling experiments. Its main features are the absolute quantification of 4-thiouridine-labeling-induced T>C conversions in the data as generated by SLAM-seq, calculating the proportion of newly synthesized transcripts, and estimating subcellular RNA half-lives. Halfpipe excels at correcting critical biases caused by typically low labeling efficiency. We measure and compare the RNA metabolism in the G1 phase and during the mitosis of synchronized human cells. We find that RNA half-lives of constantly expressed RNAs are similar in mitosis and G1 phase, suggesting that RNA stability of those genes is constant throughout the cell cycle. Our estimates correlate well with literature values and with known RNA sequence features. Halfpipe is freely available at https://github.com/IMSBCompBio/Halfpipe.
Journal Article. Volume 7, issue 1, article lqaf006. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11833738/pdf/
Citations: 0
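The Halfpipe abstract links the proportion of newly synthesized transcripts to RNA half-lives. The sketch below illustrates that link and nothing more. It assumes steady-state expression and simple first-order decay, so that the labeled fraction after a labeling time t is 1 - exp(-rate * t). This is a textbook simplification, not Halfpipe's actual model, and the function name `half_life_from_new_fraction` is hypothetical.

```python
import math


def half_life_from_new_fraction(new_fraction: float, labeling_time: float) -> float:
    """Estimate an RNA half-life from the fraction of newly synthesized transcripts.

    Assumption (not taken from the Halfpipe paper): steady state and first-order
    decay, so that new_fraction = 1 - exp(-decay_rate * labeling_time).
    The result is in the same time unit as labeling_time.
    """
    if not 0.0 < new_fraction < 1.0:
        raise ValueError("new_fraction must lie strictly between 0 and 1")
    if labeling_time <= 0.0:
        raise ValueError("labeling_time must be positive")
    # Invert the labeling equation to obtain the decay rate.
    decay_rate = -math.log(1.0 - new_fraction) / labeling_time
    # Convert the decay rate into a half-life.
    return math.log(2.0) / decay_rate


if __name__ == "__main__":
    # Hypothetical input: half of the transcripts are new after 60 minutes.
    print(half_life_from_new_fraction(0.5, 60.0))
```

Under these assumptions a transcript that is half labeled after one labeling period has a half-life equal to that period. Real data additionally need the labeling-efficiency correction that the abstract emphasizes.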
Current state and future prospects of Horizontal Gene Transfer detection.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-02-11 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf005
Andre Jatmiko Wijaya, Aleksandar Anžel, Hugues Richard, Georges Hattab
Abstract: Artificial intelligence (AI) has been shown to be beneficial in a wide range of bioinformatics applications. Horizontal Gene Transfer (HGT) is a driving force of evolutionary changes in prokaryotes. It is widely recognized that it contributes to the emergence of antimicrobial resistance (AMR), which poses a particularly serious threat to public health. Many computational approaches have been developed to study and detect HGT. However, the application of AI in this field has not been investigated. In this work, we conducted a review to provide information on the current trend of existing computational approaches for detecting HGT and to decipher the use of AI in this field. Here, we show a growing interest in HGT detection, characterized by a surge in the number of computational approaches, including AI-based approaches, in recent years. We organize existing computational approaches into a hierarchical structure of computational groups based on their computational methods and show how each computational group evolved. We make recommendations and discuss the challenges of HGT detection in general and the adoption of AI in particular. Moreover, we provide future directions for the field of HGT detection.
Journal Article. Volume 7, issue 1, article lqaf005. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11811736/pdf/
Citations: 0
Advancing personalized cancer therapy: Onko_DrugCombScreen - a novel Shiny app for precision drug combination screening.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-31 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf004
Jingyu Yang, Meng Wang, Jürgen Dönitz, Björn Chapuy, Tim Beißbarth
Abstract: Identifying and validating genotype-guided drug combinations for a specific molecular subtype in cancer therapy represents an unmet medical need and is important in enhancing efficacy and reducing toxicity. However, the exponential increase in combinatorial possibilities constrains the ability to identify and validate effective drug combinations. In this context, we have developed Onko_DrugCombScreen, an innovative tool aiming at advancing precision medicine based on identifying significant drug combination candidates in a target cancer cohort compared to a comparison cohort. Onko_DrugCombScreen, inspired by the molecular tumor board process, synergizes drug knowledgebase analysis with various statistical methodologies and data visualization techniques to pinpoint drug combination candidates. Validated through a TCGA-BRCA case study, Onko_DrugCombScreen has demonstrated its proficiency in discerning established drug combinations in a specific cancer type and in revealing potential novel drug combinations. By enhancing the capability of drug combination discovery through drug knowledgebases, Onko_DrugCombScreen represents a significant advancement in personalized cancer treatment by identifying promising drug combinations, setting the stage for the development of more precise and potent combination treatments in cancer care. The Onko_DrugCombScreen Shiny app is available at https://rshiny.gwdg.de/apps/onko_drugcombscreen/. The Git repository can be accessed at https://gitlab.gwdg.de/MedBioinf/mtb/onko_drugcombscreen.
Journal Article. Volume 7, issue 1, article lqaf004. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783568/pdf/
Citations: 0
GEMCAT - a new algorithm for gene expression-based prediction of metabolic alterations.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-31 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf003
Suraj Sharma, Roland Sauter, Madlen Hotze, Aaron Marcellus Paul Prowatke, Marc Niere, Tobias Kipura, Anna-Sophia Egger, Kathrin Thedieck, Marcel Kwiatkowski, Mathias Ziegler, Ines Heiland
Abstract: The interpretation of multi-omics datasets obtained from high-throughput approaches is important to understand disease-related physiological changes and to predict biomarkers in body fluids. We present a new metabolite-centred genome-scale metabolic modelling algorithm, the Gene Expression-based Metabolite Centrality Analysis Tool (GEMCAT). GEMCAT enables integration of transcriptomics or proteomics data to predict changes in metabolite concentrations, which can be verified by targeted metabolomics. In addition, GEMCAT allows measured and predicted metabolic changes to be traced back to the underlying alterations in gene expression or proteomics and thus enables functional interpretation and integration of multi-omics data. We demonstrate the predictive capacity of GEMCAT on three datasets and genome-scale metabolic networks from two different organisms: (i) we integrated transcriptomics and metabolomics data from an engineered human cell line with a functional deletion of the mitochondrial NAD transporter; (ii) we used a large multi-tissue multi-omics dataset from rats for transcriptome- and proteome-based prediction and verification of training-induced metabolic changes and achieved an average prediction accuracy of 70%; and (iii) we used proteomics measurements from patients with inflammatory bowel disease and verified the predicted changes using metabolomics data from the same patients. For this dataset, the prediction accuracy achieved by GEMCAT was 79%.
Journal Article. Volume 7, issue 1, article lqaf003. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783570/pdf/
Citations: 0
Consistent features observed in structural probing data of eukaryotic RNAs.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-30 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf001
Kazuteru Yamamura, Kiyoshi Asai, Junichi Iwakiri
Abstract: Understanding RNA structure is crucial for elucidating its regulatory mechanisms. With the recent commercialization of messenger RNA vaccines, the profound impact of RNA structure on stability and translation efficiency has become increasingly evident, underscoring the importance of understanding RNA structure. Chemical probing of RNA has emerged as a powerful technique for investigating RNA structure in living cells. This approach utilizes chemical probes that selectively react with accessible regions of RNA, and by measuring reactivity, the openness and potential of RNA for protein binding or base pairing can be inferred. Extensive experimental data generated using RNA chemical probing have significantly contributed to our understanding of RNA structure in cells. However, it is crucial to acknowledge potential biases in chemical probing data to ensure an accurate interpretation. In this study, we comprehensively analyzed transcriptome-scale RNA chemical probing data in eukaryotes and report common features. Notably, in all experiments, the number of bases modified in probing was small; the bases showing the top 10% reactivity reflected the known secondary structure well; bases with high reactivity were more likely to be exposed to solvent; and low reactivity did not reflect solvent exposure. These observations are important for the analysis of RNA chemical probing data.
Journal Article. Volume 7, issue 1, article lqaf001. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11780854/pdf/
Citations: 0
scHiGex: predicting single-cell gene expression based on single-cell Hi-C data.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-27 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqaf002
Bishal Shrestha, Andrew Jordan Siciliano, Hao Zhu, Tong Liu, Zheng Wang
Abstract: A novel biochemistry experiment named HiRES has been developed to capture both the chromosomal conformations and gene expression levels of individual single cells simultaneously. Nevertheless, when compared to the extensive volume of single-cell Hi-C data generated from individual cells, the number of datasets produced from this experiment remains limited in the scientific community. Hence, there is a requirement for a computational tool that can forecast the levels of gene expression in individual cells using single-cell Hi-C data from the same cells. We trained a graph transformer called scHiGex that accurately and effectively predicts gene expression levels based on single-cell Hi-C data. We conducted a benchmark of scHiGex that demonstrated notable performance on the predictions with an average absolute error of 0.07. Furthermore, the predicted levels of gene expression led to precise categorizations (adjusted Rand index score 1) of cells into distinct cell types, demonstrating that our model effectively captured the heterogeneity between individual cell types. scHiGex is freely available at https://github.com/zwang-bioinformatics/scHiGex.
Journal Article. Volume 7, issue 1, article lqaf002. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11770341/pdf/
Citations: 0
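The scHiGex abstract reports two evaluation metrics, an average absolute error and an adjusted Rand index (ARI). The sketch below shows how such metrics are commonly computed. It is a generic, standard-library illustration and not code from the scHiGex repository. The function names are hypothetical, and the abstract does not specify how the authors computed their scores.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average of |predicted - observed| over all paired values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def adjusted_rand_index(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two clusterings of the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    # Count item pairs that share a cluster in both, in the first, and in the second labeling.
    sum_cells = sum(comb(n, 2) for n in Counter(zip(labels_a, labels_b)).values())
    sum_rows = sum(comb(n, 2) for n in Counter(labels_a).values())
    sum_cols = sum(comb(n, 2) for n in Counter(labels_b).values())
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # a single item is trivially clustered identically
    expected = sum_rows * sum_cols / total_pairs
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # both labelings are trivial (one cluster, or all singletons)
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    # Hypothetical toy data: cluster names differ, but the grouping is identical.
    print(adjusted_rand_index(["a", "a", "b", "b"], [1, 1, 0, 0]))
    print(mean_absolute_error([1.0, 2.0], [1.5, 1.5]))
```

An ARI of 1 means the predicted grouping matches the reference cell types exactly up to a renaming of clusters, which is how the score quoted in the abstract should be read.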
stana: an R package for metagenotyping analysis and interactive application based on clinical data.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-08 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqae191
Noriaki Sato, Kotoe Katayama, Daichi Miyaoka, Miho Uematsu, Ayumu Saito, Kosuke Fujimoto, Satoshi Uematsu, Seiya Imoto
Abstract: Metagenotyping of metagenomic data has recently attracted increasing attention as it resolves intraspecies diversity by identifying single nucleotide variants. Furthermore, gene copy number analysis within species provides a deeper understanding of metabolic functions in microbial communities. However, a platform for examining metagenotyping results based on relevant grouping data is lacking. Here, we have developed the R package, stana, for the processing and analysis of metagenotyping results. The package consists of modules for preprocessing, statistical analysis, functional analysis and visualization. An interactive analysis environment for exploring the metagenotyping results was also developed and publicly released with over 1000 publicly available metagenome samples related to human diseases. Three examples exploring the relationship between the metagenotypes of the gut microbiome and human diseases are presented: end-stage renal disease, Crohn's disease and Parkinson's disease. The results suggest that stana facilitated the confirmation of the original study's findings and the generation of a new hypothesis. The GitHub repository for the package is available at https://github.com/noriakis/stana.
Journal Article. Volume 7, issue 1, article lqae191. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11707543/pdf/
Citations: 0
Whole-genome automated assembly pipeline for Chlamydia trachomatis strains from reference, in vitro and clinical samples using the integrated CtGAP pipeline.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-07 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqae187
Olusola Olagoke, Ammar Aziz, Lucile H Zhu, Timothy D Read, Deborah Dean
Abstract: Whole genome sequencing (WGS) is pivotal for the molecular characterization of Chlamydia trachomatis (Ct), the leading bacterial cause of sexually transmitted infections and infectious blindness worldwide. Ct WGS can inform epidemiologic, public health and outbreak investigations of these human-restricted pathogens. However, challenges persist in generating high-quality genomes for downstream analyses given its obligate intracellular nature and difficulty with in vitro propagation. No single tool exists for the entirety of Ct genome assembly, necessitating the adaptation of multiple programs with varying success. Compounding this issue is the absence of reliable Ct reference strain genomes. We, therefore, developed CtGAP (Chlamydia trachomatis Genome Assembly Pipeline) as an integrated 'one-stop-shop' pipeline for assembly and characterization of Ct genome sequencing data from various sources including isolates, in vitro samples, clinical swabs and urine. CtGAP, written in Snakemake, enables read quality statistics output, adapter and quality trimming, host read removal, de novo and reference-guided assembly, contig scaffolding, selective ompA, multi-locus-sequence and plasmid typing, phylogenetic tree construction, and recombinant genome identification. Twenty Ct reference genomes were also generated. Successfully validated on a diverse collection of 363 samples containing Ct, CtGAP represents a novel pipeline requiring minimal bioinformatics expertise with easy adaptation for use with other bacterial species.
Journal Article. Volume 7, issue 1, article lqae187. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704784/pdf/
Citations: 0
PSAURON: a tool for assessing protein annotation across a broad range of species.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2025-01-07 eCollection Date: 2025-03-01 DOI: 10.1093/nargab/lqae189
Markus J Sommer, Aleksey V Zimin, Steven L Salzberg
Abstract: Evaluating the accuracy of protein-coding sequences in genome annotations is a challenging problem for which there is no broadly applicable solution. In this manuscript, we introduce PSAURON (Protein Sequence Assessment Using a Reference ORF Network), a novel software tool developed to help assess the quality of protein-coding gene annotations. Utilizing a machine learning model trained on a diverse dataset from over 1000 plant and animal genomes, PSAURON assigns a score to coding DNA or protein sequence that reflects the likelihood that the sequence is a genuine protein-coding region. PSAURON scores can be used for genome-wide protein annotation assessment as well as the rapid identification of potentially spurious annotated proteins. Validation against established benchmarks demonstrates PSAURON's effectiveness and correlation with recognized measures of protein quality, highlighting its potential use as a widely applicable method to evaluate precision in gene annotation. PSAURON is open source and freely available at https://github.com/salzberg-lab/PSAURON.
Journal Article. Volume 7, issue 1, article lqae189. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704789/pdf/
Citations: 0