Bioinformatics Advances: latest articles

TraitTrainR: accelerating large-scale simulation under models of continuous trait evolution.
IF 2.4
Bioinformatics Advances, volume 5, issue 1, vbae196. Pub Date: 2024-12-09. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbae196
Jenniffer Roa Lozano, Mataya Duncan, Duane D McKenna, Todd A Castoe, Michael DeGiorgio, Richard Adams
Motivation: The scale and scope of comparative trait data are expanding at unprecedented rates, and recent advances in evolutionary modeling and simulation sometimes struggle to match this pace. Well-organized and flexible applications for conducting large-scale simulations of evolution hold promise in this context for understanding models and, even more so, our ability to confidently estimate them with real trait data sampled from nature.
Results: We introduce TraitTrainR, an R package designed to facilitate efficient, large-scale simulations under complex models of continuous trait evolution. TraitTrainR employs several output formats, supports popular trait data transformations, accommodates multi-trait evolution, and exhibits flexibility in defining input parameter space and model stacking. Moreover, TraitTrainR permits measurement error, allowing for investigation of its potential impacts on evolutionary inference. We envision a wealth of applications of TraitTrainR, and we demonstrate one such example by examining the problem of evolutionary model selection in three empirical phylogenetic case studies. Collectively, these demonstrations of applying TraitTrainR to explore problems in model selection underscore its utility and broader promise for addressing key questions, including those related to experimental design and statistical power, in comparative biology.
Availability and implementation: TraitTrainR is developed in R 4.4.0 and is freely available at https://github.com/radamsRHA/TraitTrainR/, which includes detailed documentation, quick-start guides, and a step-by-step tutorial.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11696700/pdf/
Citations: 0
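The kind of simulation the TraitTrainR abstract describes (a continuous trait evolving along a phylogeny, with measurement error added at the tips) can be illustrated with a minimal Python sketch. This is not TraitTrainR's R interface: the function name `simulate_bm_tips`, the edge-list tree encoding, and all parameter values are assumptions made for illustration, and only plain Brownian motion is shown.

```python
import math
import random


def simulate_bm_tips(edges, root_value, sigma2, error_sd, rng):
    """Simulate one trait under Brownian motion and return noisy tip values.

    edges: list of (node, parent, branch_length), parents listed before
           their children; the root node is named "root" (hypothetical encoding).
    sigma2: Brownian motion rate (variance accumulated per unit branch length).
    error_sd: standard deviation of Gaussian measurement error at the tips.
    """
    values = {"root": root_value}
    parents = set()
    for node, parent, length in edges:
        parents.add(parent)
        # Change along a branch is Normal(0, sigma2 * branch_length).
        values[node] = values[parent] + rng.gauss(0.0, math.sqrt(sigma2 * length))
    tips = [node for node, _, _ in edges if node not in parents]
    # Observed tip value = true value + measurement error.
    return {tip: values[tip] + rng.gauss(0.0, error_sd) for tip in tips}


# Toy three-tip tree ((A,B),C); branch lengths are made up.
EDGES = [
    ("n1", "root", 1.0),
    ("A", "n1", 0.5),
    ("B", "n1", 0.5),
    ("C", "root", 1.5),
]

# A small batch of replicate simulations, one seeded generator per replicate.
replicates = [
    simulate_bm_tips(EDGES, 0.0, 1.0, 0.1, random.Random(seed))
    for seed in range(100)
]
```

Repeating the call over a grid of `sigma2` and `error_sd` values is one way to probe how measurement error might affect downstream model selection, which is the use case the abstract highlights.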
CCfrag: scanning folding potential of coiled-coil fragments with AlphaFold.
IF 2.4
Bioinformatics Advances, volume 5, issue 1, vbae195. Pub Date: 2024-12-06. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbae195
Mikel Martinez-Goikoetxea
Motivation: Coiled coils are a widespread structural motif consisting of multiple α-helices that wind around a central axis to bury their hydrophobic core. While AlphaFold has emerged as an effective coiled-coil modeling tool, capable of accurately predicting changes in periodicity and core geometry along coiled-coil stalks, it is not without limitations, such as the generation of spuriously bent models and the inability to effectively model globally non-canonical coiled coils. To overcome these limitations, we investigated whether dividing full-length sequences into fragments would result in better models.
Results: We developed CCfrag to leverage AlphaFold for the piece-wise modeling of coiled coils. The user can create a specification, defined by window size, length of overlap, and oligomerization state, and the program produces the files necessary to run AlphaFold predictions. The structural models and their scores are then integrated into a rich per-residue representation defined by sequence- or structure-based features. Our results suggest that removing coiled-coil sequences from their native context can improve prediction confidence and result in better models. In this article, we present various use cases of CCfrag and propose that fragment-based prediction is useful for understanding the properties of long, fibrous coiled coils by revealing local features not seen in full-length models.
Availability and implementation: The program is implemented as a Python module. The code and its documentation are available at https://github.com/Mikel-MG/CCfrag.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11676326/pdf/
Citations: 0
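The fragmenting step named in the CCfrag abstract (a specification with a window size and a length of overlap) can be sketched as follows. This is a generic sliding-window illustration, not CCfrag's own code: the function name `fragment_sequence` and the choice to keep a shorter final fragment are assumptions, and the oligomerization state is not modeled here.

```python
def fragment_sequence(sequence, window, overlap):
    """Split a sequence into windows of length `window` that share `overlap` residues.

    Returns a list of (start_index, fragment) pairs. The final fragment may be
    shorter than `window` when the sequence length does not fit evenly
    (an illustrative choice, not necessarily what CCfrag does).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if not sequence:
        return []

    step = window - overlap  # how far each new window advances
    fragments = []
    start = 0
    while True:
        end = min(start + window, len(sequence))
        fragments.append((start, sequence[start:end]))
        if end == len(sequence):
            break
        start += step
    return fragments


# Toy sequence; a real input would be a full-length coiled-coil sequence.
for start, fragment in fragment_sequence("ABCDEFGHIJ", window=4, overlap=2):
    print(start, fragment)
```

Each fragment could then be written out as a separate prediction input; keeping the start index makes it possible to map per-fragment scores back onto full-length residue positions.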
MSA clustering enhances AF-Multimer's ability to predict conformational landscapes of protein-protein interactions.
IF 2.4
Bioinformatics Advances, volume 5, issue 1, vbae197. Pub Date: 2024-12-06. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbae197
Khondamir R Rustamov, Artyom Y Baev
Motivation: Understanding the conformational landscape of protein-ligand interactions is critical for elucidating the binding mechanisms that govern these interactions. Traditional methods like molecular dynamics (MD) simulations are computationally intensive, leading to a demand for more efficient approaches. This study explores how multiple sequence alignment (MSA) clustering enhances AF-Multimer's ability to predict conformational landscapes, particularly for proteins with multiple conformational states.
Results: We verified this approach by predicting the conformational landscapes of chemokine receptor 4 (CXCR4) and glucagon receptor (GCGR) in the presence of their agonists and antagonists. In our experiments, AF-Multimer predicted the structures of CXCR4 and GCGR predominantly in the active state in the presence of agonists and in the inactive state in the presence of antagonists. Moreover, we tested our approach with proteins known to switch between monomeric and dimeric states, such as lymphotactin, SH3, and thermonuclease. AFcluster-Multimer accurately predicted conformational states during oligomerization, which AFcluster with AlphaFold2 alone fails to achieve. In conclusion, MSA clustering enhances AF-Multimer's ability to predict protein conformational landscapes and mechanistic effects of ligand binding, offering a robust tool for understanding protein-ligand interactions.
Availability and implementation: Code for running AFcluster-Multimer is available at https://github.com/KhondamirRustamov/AF-Multimer-cluster.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11671036/pdf/
Citations: 0
Joint extraction of entity and relation based on fine-tuning BERT for long biomedical literatures.
IF 2.4
Bioinformatics Advances, volume 4, issue 1, vbae194. Pub Date: 2024-12-05. eCollection Date: 2024-01-01. DOI: 10.1093/bioadv/vbae194
Ting Gao, Xue Zhai, Chuan Yang, Linlin Lv, Han Wang
Motivation: Joint extraction of entities and relations is an important research direction in information extraction. The volume of scientific and technological biomedical literature is rapidly increasing, so automatically extracting entities and their relations from this literature is a key task for advancing biomedical research.
Results: The joint entity and relation extraction model performs both intra-sentence and cross-sentence extraction, alleviating the problem of long-distance information dependence in long documents. The model incorporates several deep learning techniques: (i) a fine-tuned BERT pre-trained text classification model, (ii) a Graph Convolutional Network learning method, (iii) Robust Learning Against Textual Label Noise with Self-Mixup Training, and (iv) Local regularization Conditional Random Fields. The model implements the following functions: identifying entities from complex biomedical literature effectively, extracting triples within and across sentences, reducing the effect of noisy data during training, and improving the robustness and accuracy of the model. The experimental results indicate that the model performs well on the self-built BM_GBD dataset and public datasets, enabling precise large language model enhanced knowledge graph construction for biomedical tasks.
Availability and implementation: The model and partial code are available on GitHub at https://github.com/zhaix922/Joint-extraction-of-entity-and-relation.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665630/pdf/
Citations: 0
HLA-EpiCheck: novel approach for HLA B-cell epitope prediction using 3D-surface patch descriptors derived from molecular dynamic simulations.
IF 2.4
Bioinformatics Advances, volume 4, issue 1, vbae186. Pub Date: 2024-12-05. eCollection Date: 2024-01-01. DOI: 10.1093/bioadv/vbae186
Diego Amaya-Ramirez, Magali Devriese, Romain Lhotte, Cédric Usureau, Malika Smaïl-Tabbone, Jean-Luc Taupin, Marie-Dominique Devignes
Motivation: The human leukocyte antigen (HLA) system is the main cause of organ transplant loss through the recognition of HLAs present on the graft by donor-specific antibodies raised by the recipient. It is therefore of key importance to identify all potentially immunogenic B-cell epitopes on HLAs in order to refine organ allocation. Such HLA epitopes are currently characterized by the presence of polymorphic residues called "eplets". However, many polymorphic positions in HLA sequences are not yet experimentally confirmed as eplets associated with an HLA epitope. Moreover, structural studies of these epitopes only consider 3D static structures.
Results: We present here a machine-learning approach for predicting HLA epitopes, based on 3D-surface patches and molecular dynamics simulations. A collection of 3D-surface patches labeled as Epitope (2117) or Nonepitope (4769) according to Human Leukocyte Antigen Eplet Registry information was derived from 207 HLAs (61 solved and 146 predicted structures). Descriptors derived from static and dynamic patch properties were computed and three tree-based models were trained on a reduced non-redundant dataset. HLA-EpiCheck is the prediction system formed by the three models. It leverages dynamic descriptors of 3D-surface patches for more than half of its prediction performance. Epitope predictions on unconfirmed eplets (absent from the initial dataset) are compared with experimental results and notable consistency is found.
Availability and implementation: Structural data and MD trajectories are deposited as open data under doi: 10.57745/GXZHH8. In-house scripts and machine-learning models for HLA-EpiCheck are available from https://gitlab.inria.fr/capsid.public_codes/hla-epicheck.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631505/pdf/
Citations: 0
Negative binomial mixture model for identification of noise in antibody-antigen specificity predictions from single-cell data.
IF 2.4
Bioinformatics Advances, volume 4, issue 1, vbae170. Pub Date: 2024-12-04. eCollection Date: 2024-01-01. DOI: 10.1093/bioadv/vbae170
Perry T Wasdin, Alexandra A Abu-Shmais, Michael W Irvin, Matthew J Vukovich, Ivelin S Georgiev
Motivation: LIBRA-seq (linking B cell receptor to antigen specificity by sequencing) provides a powerful tool for interrogating the antigen-specific B cell compartment and identifying antibodies against antigen targets of interest. Identification of noise in single-cell B cell receptor sequencing data, such as LIBRA-seq, is critical for improving antigen binding predictions for downstream applications including antibody discovery and machine learning technologies.
Results: In this study, we present a method for denoising LIBRA-seq data by clustering antigen counts into signal and noise components with a negative binomial mixture model. This approach leverages single-cell sequencing reads from a large, multi-donor dataset described in a recent LIBRA-seq study to develop a data-driven means for identification of technical noise. We apply this method to nine donors representing separate LIBRA-seq experiments and show that our approach provides improved predictions for in vitro antibody-antigen binding when compared to the standard scoring method, despite variance in data size and noise structure across samples. This development will improve the ability of LIBRA-seq to identify antigen-specific B cells and contribute to providing more reliable datasets for machine learning-based approaches as the corpus of single-cell B cell sequencing data continues to grow.
Availability and implementation: All data and code are available at https://github.com/IGlab-VUMC/mixture_model_denoising.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631427/pdf/
Citations: 0
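The idea of separating counts into signal and noise components with a negative binomial mixture can be sketched as follows. The code only shows the assignment step for a two-component mixture whose parameters are already known; it is not the authors' implementation, the parameter fitting is omitted, and the function names and all numeric values are assumptions chosen for illustration.

```python
import math


def nb_log_pmf(k, mean, dispersion):
    """Log-probability of count k under a negative binomial with the given
    mean and dispersion r (variance = mean + mean**2 / r)."""
    r = dispersion
    p = r / (r + mean)  # success probability in the (r, p) parameterization
    return (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log1p(-p)
    )


def signal_posterior(k, noise, signal, signal_weight):
    """Posterior probability that count k came from the signal component.

    noise, signal: (mean, dispersion) tuples for each component.
    signal_weight: mixing proportion of the signal component (0 < w < 1).
    """
    log_noise = math.log(1.0 - signal_weight) + nb_log_pmf(k, *noise)
    log_signal = math.log(signal_weight) + nb_log_pmf(k, *signal)
    top = max(log_noise, log_signal)  # subtract the max for numerical stability
    weight_noise = math.exp(log_noise - top)
    weight_signal = math.exp(log_signal - top)
    return weight_signal / (weight_noise + weight_signal)


# Hypothetical fitted parameters: low-count noise, high-count signal.
NOISE = (2.0, 2.0)
SIGNAL = (200.0, 2.0)

for count in (0, 5, 50, 300):
    print(count, round(signal_posterior(count, NOISE, SIGNAL, 0.3), 4))
```

In a full workflow the component parameters and mixing weight would be estimated from the data (for example by expectation-maximization) before cells are labeled by thresholding the posterior.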
Optimizing design of genomics studies for clonal evolution analysis.
IF 2.4
Bioinformatics Advances, volume 4, issue 1, vbae193. Pub Date: 2024-12-02. eCollection Date: 2024-01-01. DOI: 10.1093/bioadv/vbae193
Arjun Srivatsa, Russell Schwartz
Motivation: Genomic biotechnology has rapidly advanced, allowing for the inference and modification of genetic and epigenetic information at the single-cell level. While these tools hold enormous potential for basic and clinical research, they also raise difficult issues of how to design studies to deploy them most effectively. In designing a genomic study, a modern researcher might combine many sequencing modalities and sampling protocols, each with different utility, costs, and other tradeoffs. This is especially relevant for studies of somatic variation, which may involve highly heterogeneous cell populations whose differences can be probed via an extensive set of biotechnological tools. Efficiently deploying genomic technologies in this space will require principled ways to create study designs that recover desired genomic information while minimizing various measures of cost.
Results: The central problem this paper attempts to address is how one might create an optimal study design for a genomic analysis, with particular focus on studies involving somatic variation that occur most often with application to cancer genomics. We pose the study design problem as a stochastic constrained nonlinear optimization problem. We introduce a Bayesian optimization framework that iteratively optimizes for an objective function using surrogate modeling combined with pattern and gradient search. We demonstrate our procedure on several test cases to derive resource and study design allocations optimized for various goals and criteria, demonstrating its ability to optimize study designs efficiently across diverse scenarios.
Availability and implementation: https://github.com/CMUSchwartzLab/StudyDesignOptimization.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11645549/pdf/
Citations: 0
epiTCR-KDA: knowledge distillation model on dihedral angles for TCR-peptide prediction.
IF 2.4
Bioinformatics Advances, volume 4, issue 1, vbae190. Pub Date: 2024-11-29. eCollection Date: 2024-01-01. DOI: 10.1093/bioadv/vbae190
My-Diem Nguyen Pham, Chinh Tran-To Su, Thanh-Nhan Nguyen, Hoai-Nghia Nguyen, Dinh Duy An Nguyen, Hoa Giang, Dinh-Thuc Nguyen, Minh-Duy Phan, Vy Nguyen
Motivation: The prediction of T-cell receptor (TCR) and antigen binding is crucial for advancements in immunotherapy. However, most current TCR-peptide interaction predictors struggle to perform well on unseen data. This limitation may stem from the conventional use of TCR and/or peptide sequences as input, which may not adequately capture their structural characteristics. Therefore, incorporating the structural information of TCRs and peptides into the prediction model is necessary to improve its generalizability.
Results: We developed epiTCR-KDA (KDA stands for Knowledge Distillation model on Dihedral Angles), a new predictor of TCR-peptide binding that utilizes the dihedral angles between the residues of the peptide and the TCR as a structural descriptor. This structural information was integrated into a knowledge distillation model to enhance its generalizability. epiTCR-KDA demonstrated competitive prediction performance, with an area under the curve (AUC) of 1.00 for seen data and AUC of 0.91 for unseen data. On public datasets, epiTCR-KDA consistently outperformed other predictors, maintaining a median AUC of 0.93. Further analysis of epiTCR-KDA revealed that the cosine similarity of the dihedral angle vectors between the unseen testing data and training data is crucial for its stable performance. In conclusion, our epiTCR-KDA model represents a significant step forward in developing a highly effective pipeline for antigen-based immunotherapy.
Availability and implementation: epiTCR-KDA is available on GitHub (https://github.com/ddiem-ri-4D/epiTCR-KDA).
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646569/pdf/
Citations: 0
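The epiTCR-KDA abstract points to the cosine similarity between dihedral-angle vectors of unseen test data and training data. A minimal sketch of that measure is given below. How the paper encodes angles into vectors is not stated in the abstract, so the plain numeric vectors, the function names, and the example values here are all assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def max_similarity_to_training(test_vector, training_vectors):
    """Highest cosine similarity between one test vector and any training vector."""
    return max(cosine_similarity(test_vector, t) for t in training_vectors)


# Made-up angle vectors (in radians) standing in for structural descriptors.
training = [[1.0, -0.5, 2.1], [0.2, 0.9, -1.3]]
unseen = [0.9, -0.4, 2.0]
print(max_similarity_to_training(unseen, training))
```

A low maximum similarity would flag a test example as structurally far from anything seen in training, which is the situation in which the abstract suggests performance may be less stable.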
Efficient genome monomer higher-order structure annotation and identification using the GRMhor algorithm.
IF 2.4
Bioinformatics Advances, volume 4, issue 1, vbae191. Pub Date: 2024-11-28. eCollection Date: 2024-01-01. DOI: 10.1093/bioadv/vbae191
Matko Glunčić, Domjan Barić, Vladimir Paar
Motivation: Tandem monomeric units, integral components of eukaryotic genomes, form higher-order repeat (HOR) structures that play crucial roles in maintaining chromosome integrity and regulating gene expression and protein abundance. Given their significant influence on processes such as evolution, chromosome segregation, and disease, developing a sensitive and automated tool for identifying HORs across diverse genomic sequences is essential.
Results: In this study, we applied the GRMhor (Global Repeat Map hor) algorithm to analyse the centromeric region of chromosome 20 in three individual human genomes, as well as the centromeric regions of three higher primates. In all three human genomes, we identified six distinct HOR arrays, which revealed significantly greater differences in the number of canonical and variant copies, as well as in their overall structure, than would be expected given the 99.9% genetic similarity among humans. Furthermore, our analysis of higher primate genomes, which revealed entirely different HOR sequences, indicates a much larger genomic divergence between humans and higher primates than previously recognized. These results underscore the suitability of the GRMhor algorithm for studying specificities in individual genomes, particularly those involving repetitive monomers in centromere structure, which is essential for proper chromosome segregation during cell division, while also highlighting its utility in exploring centromere evolution and other repetitive genomic regions.
Availability and implementation: Source code and example binaries are freely available for download at github.com/gluncic/GRM2023.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630843/pdf/
Citations: 0
Protomix: a Python package for 1H-NMR metabolomics data preprocessing.
IF 2.4
Bioinformatics Advances, volume 5, issue 1, vbae192. Pub Date: 2024-11-27. eCollection Date: 2025-01-01. DOI: 10.1093/bioadv/vbae192
Mohammed Zniber, Youssef Fatihi, Tan-Phat Huynh
Motivation: NMR-based metabolomics is a field driven by technological advancements, necessitating the use of advanced preprocessing tools. Despite this need, there is a remarkable scarcity of comprehensive and user-friendly preprocessing tools in Python. To bridge this gap, we have developed Protomix, a Python package designed for metabolomics research. Protomix offers a set of automated, efficient, and user-friendly signal-preprocessing steps, tailored to streamline and enhance the preprocessing phase in metabolomics studies.
Results: This package presents a comprehensive preprocessing pipeline compatible with various data analysis tools. It encompasses a suite of functionalities for data extraction, preprocessing, and interactive visualization. Additionally, it includes a tutorial in the form of a Python Jupyter notebook, specifically designed for the analysis of 1D 1H-NMR metabolomics data related to prostate cancer and benign prostatic hyperplasia.
Availability and implementation: Protomix can be accessed at https://github.com/mzniber/protomix and https://protomix.readthedocs.io/en/latest/index.html.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11671038/pdf/
Citations: 0