NAR Genomics and Bioinformatics: Latest Articles

Predicting the pro-longevity or anti-longevity effect of model organism genes with enhanced Gaussian noise augmentation-based contrastive learning on protein-protein interaction networks.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-28 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae153
Ibrahim Alsaggaf, Alex A Freitas, Cen Wan
{"title":"Predicting the pro-longevity or anti-longevity effect of model organism genes with enhanced Gaussian noise augmentation-based contrastive learning on protein-protein interaction networks.","authors":"Ibrahim Alsaggaf, Alex A Freitas, Cen Wan","doi":"10.1093/nargab/lqae153","DOIUrl":"10.1093/nargab/lqae153","url":null,"abstract":"<p><p>Ageing is a highly complex and important biological process that plays major roles in many diseases. Therefore, it is essential to better understand the molecular mechanisms of ageing-related genes. In this work, we proposed a novel enhanced Gaussian noise augmentation-based contrastive learning (EGsCL) framework to predict the pro-longevity or anti-longevity effect of four model organisms' ageing-related genes by exploiting protein-protein interaction (PPI) networks. The experimental results suggest that EGsCL successfully outperformed the conventional Gaussian noise augmentation-based contrastive learning methods and obtained state-of-the-art performance on three model organisms' predictive tasks when merely relying on PPI network data. In addition, we use EGsCL to predict 10 novel pro-/anti-longevity mouse genes and discuss the support for these predictions in the literature.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae153"},"PeriodicalIF":4.0,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616696/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
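As a rough illustration of the conventional Gaussian noise augmentation-based contrastive learning that the abstract above uses as its baseline, the sketch below builds two noisy views of each feature vector and scores them with an InfoNCE-style loss. It is not the EGsCL method, whose enhancements the abstract does not spell out. The function names `gaussian_views` and `info_nce`, the noise level `sigma`, the `temperature` value and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np


def gaussian_views(x, sigma, rng):
    """Return two independently noised copies of x (hypothetical helper)."""
    view_a = x + rng.normal(0.0, sigma, size=x.shape)
    view_b = x + rng.normal(0.0, sigma, size=x.shape)
    return view_a, view_b


def info_nce(a, b, temperature=0.5):
    """InfoNCE-style loss: row i of a should match row i of b."""
    # L2-normalise so the dot product is a cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    # Subtract the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


# Stand-in data: 8 "genes" with 16 assumed PPI-derived features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))
view_a, view_b = gaussian_views(features, sigma=0.1, rng=rng)
loss = info_nce(view_a, view_b)
```

In a real pipeline the two views would pass through a trainable encoder before the loss is computed; the identity mapping is used here only to keep the sketch short.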
Comparative molecular dynamics calculations of duplexation of chemically modified analogs of DNA used for antisense applications.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-28 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae155
Rodrigo Galindo-Murillo, Jack S Cohen, Barak Akabayov
{"title":"Comparative molecular dynamics calculations of duplexation of chemically modified analogs of DNA used for antisense applications.","authors":"Rodrigo Galindo-Murillo, Jack S Cohen, Barak Akabayov","doi":"10.1093/nargab/lqae155","DOIUrl":"10.1093/nargab/lqae155","url":null,"abstract":"<p><p>We have subjected several analogs of DNA that have been widely used as antisense oligonucleotide (ASO) inhibitors of gene expression to comparative molecular dynamics (MD) calculations of their ability to form duplexes with DNA and RNA. The analogs included in this study are the phosphorothioate (PS), peptide nucleic acid (PNA), locked nucleic acid (LNA), morpholino nucleic acid (PMO), the 2'-OMe, 2'-F, 2'-methoxyethyl (2'-MOE) and the constrained cET analogs, as well as the natural phosphodiester (PO) as control, for a total of nine structures, in both XNA-DNA and XNA-RNA duplexes. This is intended as an objective criterion for their relative ability to duplex with an RNA complement and their comparative potential for antisense applications. We have found that the constrained furanose ring analogs show increased stability when considering this study's structural and energetic parameters. The 2'-MOE modification, even though energetically stable, has an elevated dynamic range and breathing properties due to the bulkier moiety in the C2' position of the furanose. The smaller modifications in the C2' position, 2'-F, 2'-OMe and PS also form stable and energetically favored duplexes with both DNA and RNA. The morpholino moiety allows for increased tolerance in accommodating either DNA or RNA and the PNA, with the PNA being the most energetically stable, although with a preference for the B-form DNA. 
In summary, we can rank the overall preference of hybrid strand formations as PNA > cET/LNA > PS/2'-F/2'-OMe > morpholino > 2'-MOE for the efficacy of duplex formation.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae155"},"PeriodicalIF":4.0,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616695/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
VILOCA: sequencing quality-aware viral haplotype reconstruction and mutation calling for short-read and long-read data.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-28 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae152
Lara Fuhrmann, Benjamin Langer, Ivan Topolsky, Niko Beerenwinkel
{"title":"VILOCA: sequencing quality-aware viral haplotype reconstruction and mutation calling for short-read and long-read data.","authors":"Lara Fuhrmann, Benjamin Langer, Ivan Topolsky, Niko Beerenwinkel","doi":"10.1093/nargab/lqae152","DOIUrl":"10.1093/nargab/lqae152","url":null,"abstract":"<p><p>RNA viruses exist as large heterogeneous populations within their host. The structure and diversity of virus populations affects disease progression and treatment outcomes. Next-generation sequencing allows detailed viral population analysis, but inferring diversity from error-prone reads is challenging. Here, we present VILOCA (VIral LOcal haplotype reconstruction and mutation CAlling for short and long read data), a method for mutation calling and reconstruction of local haplotypes from short- and long-read viral sequencing data. Local haplotypes refer to genomic regions that have approximately the length of the input reads. VILOCA recovers local haplotypes by using a Dirichlet process mixture model to cluster reads around their unobserved haplotypes and leveraging quality scores of the sequencing reads. We assessed the performance of VILOCA in terms of mutation calling and haplotype reconstruction accuracy on simulated and experimental Illumina, PacBio and Oxford Nanopore data. On simulated and experimental Illumina data, VILOCA performed better or similar to existing methods. On the simulated long-read data, VILOCA is able to recover on average [Formula: see text] of the ground truth mutations with perfect precision compared to only [Formula: see text] recall and [Formula: see text] precision of the second-best method. 
In summary, VILOCA provides significantly improved accuracy in mutation and haplotype calling, especially for long-read sequencing data, and therefore facilitates the comprehensive characterization of heterogeneous within-host viral populations.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae152"},"PeriodicalIF":4.0,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616694/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CancerPro: deciphering the pan-cancer prognostic landscape through combinatorial enrichment analysis and knowledge network insights.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-21 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae157
Zhigang Wang, Yize Yuan, Zhe Wang, Wenjia Zhang, Chong Chen, Zhaojun Duan, Suyuan Peng, Jie Zheng, Yongqun He, Xiaolin Yang
{"title":"CancerPro: deciphering the pan-cancer prognostic landscape through combinatorial enrichment analysis and knowledge network insights.","authors":"Zhigang Wang, Yize Yuan, Zhe Wang, Wenjia Zhang, Chong Chen, Zhaojun Duan, Suyuan Peng, Jie Zheng, Yongqun He, Xiaolin Yang","doi":"10.1093/nargab/lqae157","DOIUrl":"10.1093/nargab/lqae157","url":null,"abstract":"<p><p>Gene expression levels serve as valuable markers for assessing prognosis in cancer patients. To understand the mechanisms underlying prognosis and explore potential therapeutics across diverse cancers, we developed CancerPro (https:/medcode.link/cancerpro). This knowledge network platform integrates comprehensive biomedical data on genes, drugs, diseases and pathways, along with their interactions. By integrating ontology and knowledge graph technologies, CancerPro offers a user-friendly interface for analyzing pan-cancer prognostic markers and exploring genes or drugs of interest. CancerPro implements three core functions: gene set enrichment analysis based on multiple annotations; in-depth drug analysis; and in-depth gene list analysis. Using CancerPro, we categorized genes and cancers into distinct groups and utilized network analysis to identify key biological pathways associated with unfavorable prognostic genes. The platform further pinpoints potential drug targets and explores potential links between prognostic markers and patient characteristics such as glutathione levels and obesity. For renal and prostate cancer, CancerPro identified risk genes linked to immune deficiency pathways and alternative splicing abnormalities. This research highlights CancerPro's potential as a valuable tool for researchers to explore pan-cancer prognostic markers and uncover novel therapeutic avenues. 
Its flexible tools support a wide range of biological investigations, making it a versatile asset in cancer research and beyond.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae157"},"PeriodicalIF":4.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616677/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
There will always be variants of uncertain significance. Analysis of VUSs.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-21 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae154
Haoyang Zhang, Muhammad Kabir, Saeed Ahmed, Mauno Vihinen
{"title":"There will always be variants of uncertain significance. Analysis of VUSs.","authors":"Haoyang Zhang, Muhammad Kabir, Saeed Ahmed, Mauno Vihinen","doi":"10.1093/nargab/lqae154","DOIUrl":"10.1093/nargab/lqae154","url":null,"abstract":"<p><p>The ACMG/AMP guidelines include five categories of which variants of uncertain significance (VUSs) have received increasing attention. Recently, Fowler and Rehm claimed that all or most VUSs could be reclassified as pathogenic or benign within few years. To test this claim, we collected validated benign, pathogenic, VUS and conflicting variants from ClinVar and LOVD and investigated differences at gene, protein, structure, and variant levels. The gene and protein features included inheritance patterns, actionability, functional categories for housekeeping, essential, complete knockout, lethality and haploinsufficient proteins, Gene Ontology annotations, and protein network properties. Structural properties included the location at secondary structural elements, intrinsically disordered regions, transmembrane regions, repeats, conservation, and accessibility. Gene features were distributions of nucleotides, their groupings, codons, and location to CpG islands. The distributions of amino acids and their groups were investigated. VUSs did not markedly differ from other variants. The only major differences were the accessibility and conservation of pathogenic variants, and reduced ratio of repeat-locating variants in VUSs. Thus, all VUSs cannot be distinguished from other types of variants. They display one form of natural biological heterogeneity. 
Instead of concentrating on eradicating VUSs, the community would benefit from investigating and understanding factors that contribute to phenotypic heterogeneity.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae154"},"PeriodicalIF":4.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616676/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Bilingual language model for protein sequence and structure.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-15 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae150
Michael Heinzinger, Konstantin Weissenow, Joaquin Gomez Sanchez, Adrian Henkel, Milot Mirdita, Martin Steinegger, Burkhard Rost
{"title":"Bilingual language model for protein sequence and structure.","authors":"Michael Heinzinger, Konstantin Weissenow, Joaquin Gomez Sanchez, Adrian Henkel, Milot Mirdita, Martin Steinegger, Burkhard Rost","doi":"10.1093/nargab/lqae150","DOIUrl":"10.1093/nargab/lqae150","url":null,"abstract":"<p><p>Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences. Here, we leverage pLMs to simultaneously model both modalities in a single model. We encode protein structures as token sequences using the 3Di-alphabet introduced by the 3D-alignment method <i>Foldseek</i>. For training, we built a non-redundant dataset from AlphaFoldDB and fine-tuned an existing pLM (ProtT5) to translate between 3Di and amino acid sequences. As a proof-of-concept for our novel approach, dubbed Protein 'structure-sequence' T5 (<i>ProstT5</i>), we showed improved performance for subsequent, structure-related prediction tasks, leading to three orders of magnitude speedup for deriving 3Di. This will be crucial for future applications trying to search metagenomic sequence databases at the sensitivity of structure comparisons. Our work showcased the potential of pLMs to tap into the information-rich protein structure revolution fueled by AlphaFold2. 
<i>ProstT5</i> paves the way to develop new tools integrating the vast resource of 3D predictions and opens new research avenues in the post-AlphaFold2 era.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae150"},"PeriodicalIF":4.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616678/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Correction to 'NFixDB (Nitrogen Fixation DataBase)-a comprehensive integrated database for robust 'omics analysis of diazotrophs'.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-15 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae164
{"title":"Correction to 'NFixDB (Nitrogen Fixation DataBase)-a comprehensive integrated database for robust 'omics analysis of diazotrophs'.","authors":"","doi":"10.1093/nargab/lqae164","DOIUrl":"10.1093/nargab/lqae164","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.1093/nar/lqae063.].</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae164"},"PeriodicalIF":4.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616680/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pymportx: facilitating next-generation transcriptomics analysis in Python.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-15 eCollection Date: 2024-12-01 DOI: 10.1093/nargab/lqae160
Paula Pena González, Dafne Lozano-Paredes, José Luis Rojo-Álvarez, Luis Bote-Curiel, Víctor Javier Sánchez-Arévalo Lobo
{"title":"Pymportx: facilitating next-generation transcriptomics analysis in Python.","authors":"Paula Pena González, Dafne Lozano-Paredes, José Luis Rojo-Álvarez, Luis Bote-Curiel, Víctor Javier Sánchez-Arévalo Lobo","doi":"10.1093/nargab/lqae160","DOIUrl":"10.1093/nargab/lqae160","url":null,"abstract":"<p><p>The efficient importation of quantified gene expression data is pivotal in transcriptomics. Historically, the R package Tximport addressed this need by enabling seamless data integration from various quantification tools. However, the Python community lacked a corresponding tool, restricting cross-platform bioinformatics interoperability. We introduce Pymportx, a Python adaptation of Tximport, which replicates and extends the original package's functionalities. Pymportx maintains the integrity and accuracy of gene expression data while improving processing speed and integration within the Python ecosystem. It supports new data formats and includes tools for enhanced data exploration and analysis. Available under the MIT license, Pymportx integrates smoothly with Python's bioinformatics tools, facilitating a unified and efficient workflow across the R and Python ecosystems. 
This advancement not only broadens access to Python's extensive toolset but also fosters interdisciplinary collaboration and the development of cutting-edge bioinformatics analyses.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae160"},"PeriodicalIF":4.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616679/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142781318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A general kernel machine regression framework using principal component analysis for jointly testing main and interaction effects: Applications to human microbiome studies.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-12 eCollection Date: 2024-09-01 DOI: 10.1093/nargab/lqae148
Hyunwook Koh
{"title":"A general kernel machine regression framework using principal component analysis for jointly testing main and interaction effects: Applications to human microbiome studies.","authors":"Hyunwook Koh","doi":"10.1093/nargab/lqae148","DOIUrl":"https://doi.org/10.1093/nargab/lqae148","url":null,"abstract":"<p><p>The effect of a treatment on a health or disease response can be modified by genetic or microbial variants. It is the matter of interaction effects between genetic or microbial variants and a treatment. To powerfully discover genetic or microbial biomarkers, it is crucial to incorporate such interaction effects in addition to the main effects. However, in the context of kernel machine regression analysis of its kind, existing methods cannot be utilized in a situation, where a kernel is available but its underlying real variants are unknown. To address such limitations, I introduce a general kernel machine regression framework using principal component analysis for jointly testing main and interaction effects. It begins with extracting principal components from an input kernel through the singular value decomposition. Then, it employs the principal components as surrogate variants to construct three endogenous kernels for the main effects, interaction effects, and both of them, respectively. Hence, it works with a kernel as an input without knowing its underlying real variants, and also detects either the main effects, interaction effects, or both of them robustly. I also introduce its omnibus testing extension to multiple input kernels, named OmniK. 
I demonstrate its use for human microbiome studies.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae148"},"PeriodicalIF":4.0,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11555437/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142629627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
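The procedure described in the abstract above (principal components extracted from an input kernel, then reused as surrogate variants for three kernels) might look roughly like the following sketch. The abstract does not give the exact kernel construction, so the linear kernels on `Z`, on `Z` scaled by the treatment, and on their concatenation are assumptions, as are the function name `surrogate_kernels`, the tolerance `tol` and the simulated data. The hypothesis tests themselves are not shown.

```python
import numpy as np


def surrogate_kernels(K, treatment, tol=1e-10):
    """Build main, interaction and joint kernels from an input kernel K.

    K is assumed to be a symmetric positive semi-definite (n, n) matrix and
    treatment a length-n numeric vector. This is an illustrative sketch only.
    """
    K = np.asarray(K, dtype=float)
    t = np.asarray(treatment, dtype=float)

    # Singular value decomposition of the kernel; for a symmetric PSD
    # matrix the singular values coincide with the eigenvalues.
    U, s, _ = np.linalg.svd(K)
    keep = s > tol * s.max()

    # Principal components used as surrogate variants: Z @ Z.T rebuilds K.
    Z = U[:, keep] * np.sqrt(s[keep])
    # Surrogate interaction terms: each component times the treatment.
    ZT = Z * t[:, None]

    K_main = Z @ Z.T
    K_int = ZT @ ZT.T
    both = np.hstack([Z, ZT])
    K_both = both @ both.T
    return K_main, K_int, K_both


# Simulated example: a rank-3 kernel on 6 samples and a binary treatment.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))
K = X @ X.T
t = np.array([0, 1, 0, 1, 1, 0], dtype=float)
K_main, K_int, K_both = surrogate_kernels(K, t)
```

Under these assumptions, keeping every non-zero component means `K_main` reproduces the input kernel and `K_int` equals the input kernel multiplied elementwise by the outer product of the treatment vector, so the underlying variants are never needed.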
Refining SARS-CoV-2 intra-host variation by leveraging large-scale sequencing data.
IF 4
NAR Genomics and Bioinformatics Pub Date : 2024-11-12 eCollection Date: 2024-09-01 DOI: 10.1093/nargab/lqae145
Fatima Mostefai, Jean-Christophe Grenier, Raphaël Poujol, Julie Hussin
{"title":"Refining SARS-CoV-2 intra-host variation by leveraging large-scale sequencing data.","authors":"Fatima Mostefai, Jean-Christophe Grenier, Raphaël Poujol, Julie Hussin","doi":"10.1093/nargab/lqae145","DOIUrl":"https://doi.org/10.1093/nargab/lqae145","url":null,"abstract":"<p><p>Understanding viral genome evolution during host infection is crucial for grasping viral diversity and evolution. Analyzing intra-host single nucleotide variants (iSNVs) offers insights into new lineage emergence, which is important for predicting and mitigating future viral threats. Despite next-generation sequencing's potential, challenges persist, notably sequencing artifacts leading to false iSNVs. We developed a workflow to enhance iSNV detection in large NGS libraries, using over 130 000 SARS-CoV-2 libraries to distinguish mutations from errors. Our approach integrates bioinformatics protocols, stringent quality control, and dimensionality reduction to tackle batch effects and improve mutation detection reliability. Additionally, we pioneer the application of the PHATE visualization approach to genomic data and introduce a methodology that quantifies how related groups of data points are represented within a two-dimensional space, enhancing clustering structure explanation based on genetic similarities. 
This workflow advances accurate intra-host mutation detection, facilitating a deeper understanding of viral diversity and evolution.</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae145"},"PeriodicalIF":4.0,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11555433/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142629558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0