BMC Bioinformatics: Latest Articles

ChIPbinner: an R package for analyzing broad histone marks binned in uniform windows from ChIP-Seq or CUT&RUN/TAG data.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-24 DOI: 10.1186/s12859-025-06103-6
Reinnier Padilla, Eric Bareke, Bo Hu, Jacek Majewski
Background: The decreasing costs of sequencing, along with the growing understanding of epigenetic mechanisms driving diseases, have led to the increased application in biomedical research of chromatin immunoprecipitation (ChIP), Cleavage Under Targets & Release Using Nuclease (CUT&RUN) and Cleavage Under Targets and Tagmentation (CUT&TAG) sequencing, which are designed to map DNA- or chromatin-binding proteins to their genome targets. Existing software tools, namely peak-callers, are available for analyzing data from these technologies, although they often struggle with diffuse and broad signals, such as those associated with broad histone post-translational modifications (PTMs).
Results: To address this limitation, we present ChIPbinner, an open-source R package tailored for reference-agnostic analysis of broad PTMs. Instead of relying on pre-identified enriched regions from peak-callers, ChIPbinner divides (bins) the genome into uniform windows. Thus, users are provided with an unbiased method to explore genome-wide differences between two samples using scatterplots, principal component analysis (PCA), and correlation plots. It also facilitates the identification and characterization of differential clusters of bins, allowing users to focus on specific genomic regions significantly affected by treatments or mutations. We demonstrated the effectiveness of this tool through simulated datasets and a case study assessing H3K36me2 depletion following NSD1 knockout in head and neck squamous cell carcinoma, highlighting the advantages of ChIPbinner in detecting broad histone mark changes over existing software.
Conclusions: Binned analysis provides a more holistic view of the genomic landscape, allowing researchers to uncover broader patterns and correlations that may be missed when solely focusing on individual peaks. ChIPbinner offers researchers a convenient tool to perform binned analysis. It improves on previously published software by providing a clustering approach that is independent of each bin's differential enrichment status and more precisely identifies differentially bound regions for broad histone marks, while also offering additional features for downstream analysis of these differentially enriched bins.
BMC Bioinformatics 26(1):89. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11934474/pdf/
Citations: 0
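The binning idea in the ChIPbinner abstract above (dividing the genome into uniform windows and comparing two samples bin by bin) can be illustrated with a minimal sketch. This is not ChIPbinner's R API; the function names `bin_counts` and `log2_fold_change`, the 10 kb default bin size, and the pseudocount are assumptions chosen for illustration, and library-size normalization is deliberately left out.

```python
import math

def bin_counts(read_starts, chrom_length, bin_size=10_000):
    """Count reads whose start falls in each fixed-width window of one chromosome."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division; the last bin may be shorter
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < chrom_length:  # ignore positions outside the chromosome
            counts[pos // bin_size] += 1
    return counts

def log2_fold_change(treated, control, pseudocount=1.0):
    """Per-bin log2 ratio; the pseudocount avoids division by zero in empty bins."""
    return [
        math.log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(treated, control)
    ]

# Toy read start positions for two hypothetical samples on a 30 kb chromosome.
wild_type = bin_counts([0, 5, 9_999, 10_000, 25_000], chrom_length=30_000)
knockout = bin_counts([100, 12_000, 13_000, 14_000, 27_000], chrom_length=30_000)
print(wild_type)
print(log2_fold_change(knockout, wild_type))
```

Because every bin is scored regardless of whether a peak-caller would have flagged it, broad and diffuse changes show up as runs of adjacent bins with similar fold changes, which is the kind of signal the abstract says peak-based tools tend to miss.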
Uncovering latent biological function associations through gene set embeddings.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-24 DOI: 10.1186/s12859-025-06100-9
Yuhang Huang, Fan Zhong, Lei Liu
Background: The complexity of biological systems has increasingly been unraveled through computational methods, with biological network analysis now focusing on the construction and exploration of well-defined interaction networks. Traditional graph-theoretical approaches have been instrumental in mapping key biological processes using high-confidence interaction data. However, these methods often struggle with incomplete and/or heterogeneous datasets. In this study, we extend beyond conventional bipartite models by integrating attribute-driven knowledge from the Molecular Signatures Database (MSigDB) using the node2vec algorithm.
Results: Our approach explores unsupervised biological relationships and uncovers potential associations between genes and biological terms through network connectivity analysis. By embedding both human and mouse data into a shared vector space, we validate our findings across species, further strengthening the robustness of our method.
Conclusions: This integrative framework reveals both expected and novel biological insights, offering a comprehensive perspective that complements traditional biological network analysis and paves the way for deeper understanding of complex biological processes and diseases.
BMC Bioinformatics 26(1):90. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11934463/pdf/
Citations: 0
Towards a standard benchmark for phenotype-driven variant and gene prioritisation algorithms: PhEval - Phenotypic inference Evaluation framework.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-22 DOI: 10.1186/s12859-025-06105-4
Yasemin Bridges, Vinicius de Souza, Katherina G Cortes, Melissa Haendel, Nomi L Harris, Daniel R Korn, Nikolaos M Marinakis, Nicolas Matentzoglu, James A McLaughlin, Christopher J Mungall, Aaron Odell, David Osumi-Sutherland, Peter N Robinson, Damian Smedley, Julius O B Jacobsen
Background: Computational approaches to support rare disease diagnosis are challenging to build, requiring the integration of complex data types such as ontologies, gene-to-phenotype associations, and cross-species data into variant and gene prioritisation algorithms (VGPAs). However, the performance of VGPAs has been difficult to measure and is impacted by many factors, for example, ontology structure, annotation completeness or changes to the underlying algorithm. Assertions of the capabilities of VGPAs are often not reproducible, in part because there is no standardised, empirical framework and openly available patient data to assess the efficacy of VGPAs, ultimately hindering the development of effective prioritisation tools.
Results: In this paper, we present our benchmarking tool, PhEval, which aims to provide a standardised and empirical framework to evaluate phenotype-driven VGPAs. The inclusion of standardised test corpora and test corpus generation tools in the PhEval suite of tools allows open benchmarking and comparison of methods on standardised data sets.
Conclusions: PhEval and the standardised test corpora solve the issues of patient data availability and experimental tooling configuration when benchmarking and comparing rare disease VGPAs. By providing standardised data on patient cohorts from real-world case-reports and controlling the configuration of evaluated VGPAs, PhEval enables transparent, portable, comparable and reproducible benchmarking of VGPAs. As these tools are often a key component of many rare disease diagnostic pipelines, a thorough and standardised method of assessment is essential for improving patient diagnosis and care.
BMC Bioinformatics 26(1):87. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11929307/pdf/
Citations: 0
Deep-ProBind: binding protein prediction with transformer-based deep learning model.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-22 DOI: 10.1186/s12859-025-06101-8
Salman Khan, Sumaiya Noor, Hamid Hussain Awan, Shehryar Iqbal, Salman A AlQahtani, Naqqash Dilshad, Nijad Ahmad
Binding proteins play a crucial role in biological systems by selectively interacting with specific molecules, such as DNA, RNA, or peptides, to regulate various cellular processes. Their ability to recognize and bind target molecules with high specificity makes them essential for signal transduction, transport, and enzymatic activity. Traditional experimental methods for identifying protein-binding peptides are costly and time-consuming. Current sequence-based approaches often struggle with accuracy, focusing too narrowly on proximal sequence features and ignoring structural data. This study presents Deep-ProBind, a powerful prediction model designed to classify protein binding sites by integrating sequence and structural information. The proposed model employs a transformer and evolutionary-based attention mechanism, i.e., Bidirectional Encoder Representations from Transformers (BERT) and the Pseudo Position-Specific Scoring Matrix with Discrete Wavelet Transform (PsePSSM-DWT) approach, to encode peptides. The SHapley Additive exPlanations (SHAP) algorithm selects the optimal hybrid features, and a Deep Neural Network (DNN) is then used as the classification algorithm to predict protein-binding peptides. The performance of the proposed model was evaluated in comparison with traditional Machine Learning (ML) algorithms and existing models. Experimental results demonstrate that Deep-ProBind achieved 92.67% accuracy with tenfold cross-validation on benchmark datasets and 93.62% accuracy on independent samples. Deep-ProBind outperforms existing models by 3.57% on training data and 1.52% on independent tests. These results demonstrate Deep-ProBind's reliability and effectiveness, making it a valuable tool for researchers and a potential resource in pharmacological studies, where peptide binding plays a critical role in therapeutic development.
BMC Bioinformatics 26(1):88. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11929993/pdf/
Citations: 0
Prediction of drug's anatomical therapeutic chemical (ATC) code by constructing biological profiles of ATC codes.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-21 DOI: 10.1186/s12859-025-06102-7
Lei Chen, Yiwen Lu, Jing Xu, Bo Zhou
Background: The Anatomical Therapeutic Chemical (ATC) classification system, proposed and maintained by the World Health Organization, is among the most widely used drug classification schemes. Recently, it has become a key research focus in drug repositioning. Computational models often pair drugs with ATC codes to explore drug-ATC code associations. However, the limited information available for ATC codes constrains these models, leaving significant room for improvement.
Results: This study presents an inference method to identify highly related target proteins, structural features, and side effects for each ATC code, constructing comprehensive biological profiles. Association networks for target proteins, structural features, and side effects are established, and a random walk with restart algorithm is applied to these networks to extract raw associations. A permutation test is then conducted to exclude false positives, yielding robust biological profiles for ATC codes. These profiles are used to construct new ATC code kernels, which are integrated with ATC code kernels from the existing model PDATC-NCPMKL. The recommendation matrix is subsequently generated using the procedures of PDATC-NCPMKL. Cross-validation results demonstrate that the new model achieves AUROC and AUPR values exceeding 0.96.
Conclusion: The proposed model outperforms PDATC-NCPMKL and other previous models. Analysis of the contributions of the newly added ATC code kernels confirms the value of biological profiles in enhancing the prediction of drug-ATC code associations.
BMC Bioinformatics 26(1):86. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927162/pdf/
Citations: 0
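The ATC-code abstract above relies on a random walk with restart (RWR) over association networks. The following is a generic, minimal sketch of RWR on a small undirected graph, not the authors' implementation: the function name, the restart probability of 0.3, the convergence tolerance, and the toy adjacency matrix are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Iterate p <- (1 - r) * W p + r * p0, where W is the column-normalised adjacency."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]

    # Restart distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass flowing into node i from each neighbour j.
            flow = sum(
                adj[i][j] / col_sums[j] * p[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            nxt.append((1.0 - restart) * flow + restart * p0[i])
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:  # L1 convergence check
            return nxt
        p = nxt
    return p

# Toy path graph 0 - 1 - 2, with node 0 as the only seed.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds=[0])
print(scores)
```

Nodes closer to the seed end up with higher steady-state scores, which is why the raw associations produced this way are ranked by proximity in the network; the abstract then applies a permutation test to filter out associations that are no stronger than chance.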
Validating a web application's use of genetic distance to determine helminth species boundaries and aid in identification.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-18 DOI: 10.1186/s12859-025-06098-0
Abigail Hui En Chan, Urusa Thaenkham, Tanaphum Wichaita, Sompob Saralamba
Background: Parasitic helminths exhibit significant diversity, complicating both morphological and molecular species identification. Moreover, no helminth-specific tool is currently available to aid in species identification of helminths using molecular data. To address this, we developed and validated a straightforward, user-friendly application named Applying Taxonomic Boundaries for Species Identification of Helminths (ABIapp) using R and the Shiny framework. Serving as a preliminary step in species identification, ABIapp is designed to assist in visualizing taxonomic boundaries for nematodes, trematodes, and cestodes. ABIapp employs a database of genetic distance cut-offs determined by the K-means algorithm to establish taxonomic boundaries for ten genetic markers. Validation of ABIapp was performed both in silico and with actual specimens to determine its classification accuracy. The in silico validation involved 591 genetic distances sourced from 117 publications, while the validation with actual specimens utilized ten specimens. ABIapp's accuracy was also compared with other online platforms to assess its robustness in assisting helminth identification.
Results: ABIapp achieved an overall classification accuracy of 76% for in silico validation and 75% for actual specimens. Additionally, compared to other platforms, the classification accuracy of ABIapp was superior, proving its effectiveness in determining helminth taxonomic boundaries. With its user-friendly interface, minimal data input requirements, and precise classification capabilities, ABIapp offers multiple benefits for helminth researchers and can aid in identification.
Conclusions: Built on a helminth-specific database, ABIapp serves as a pioneering tool for helminth researchers, offering an invaluable resource for determining species boundaries and aiding in species identification of helminths. The availability of ABIapp to the community of helminth researchers may further enhance research in the field of helminthology. To enhance ABIapp's accuracy and utility, the database will be updated annually.
BMC Bioinformatics 26(1):85. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11917154/pdf/
Citations: 0
FPGA-based accelerator for adaptive banded event alignment in nanopore sequencing data analysis.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-17 DOI: 10.1186/s12859-024-06011-1
Yilin Feng, Zheyu Li, Gulsum Gudukbay Akbulut, Vijaykrishnan Narayanan, Mahmut Taylan Kandemir, Chita R Das
Background: Adaptive Banded Event Alignment (ABEA) stands as a critical algorithmic component in sequence polishing and DNA methylation detection, employing dynamic programming to align raw Nanopore signal with reference reads. Motivated by the observation that, compared to CPUs and GPUs, cutting-edge FPGAs demonstrate, in certain cases, superior performance at a reduced cost and energy consumption, this paper presents an efficient FPGA-based accelerator for ABEA, leveraging the inherent high parallelism and sequential access pattern within ABEA.
Result: Our proposed FPGA-based ABEA accelerator significantly enhances ABEA performance compared to the original CPU-based implementation in Nanopolish as well as the state-of-the-art acceleration on GPU and FPGA platforms. Specifically, targeting Xilinx VU9P, our accelerator achieves an average throughput speedup of 10.05× over the CPU-only implementation, an average 1.81× speedup over the state-of-the-art GPU acceleration with only 7.2% of the energy, and a speedup of 10.11× compared to an existing FPGA accelerator.
Conclusion: Our work demonstrates that intensive genome analysis can benefit significantly from cutting-edge FPGAs, offering improvements in both performance and energy consumption.
BMC Bioinformatics 26(1):83. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11917103/pdf/
Citations: 0
Transfer learning for accelerated failure time model with microarray data.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-17 DOI: 10.1186/s12859-025-06056-w
Yan-Bo Pei, Zheng-Yang Yu, Jun-Shan Shen
Background: In microarray prognostic studies, researchers aim to identify genes associated with disease progression. However, due to the rarity of certain diseases and the cost of sample collection, researchers often face the challenge of limited sample size, which may prevent accurate estimation and risk assessment. This challenge necessitates methods that can leverage information from external data (i.e., source cohorts) to improve gene selection and risk assessment based on the current sample (i.e., target cohort).
Method: We propose a transfer learning method for the accelerated failure time (AFT) model to enhance the fit on the target cohort by adaptively borrowing information from the source cohorts. We use a leave-one-out cross-validation based procedure to evaluate the relative stability of selected genes and overall predictive power.
Conclusion: In simulation studies, the transfer learning method for the AFT model can correctly identify a small number of genes, and its estimation error is smaller than the estimation error obtained without using the source cohorts. Furthermore, the proposed method demonstrates satisfactory accuracy and robustness in addressing heterogeneity across the cohorts compared to the method that directly combines the target and the source cohorts in the AFT model. We analyze the GSE88770 and GSE25055 data using the proposed method. The selected genes are relatively stable, and the proposed method can make an overall satisfactory risk prediction.
BMC Bioinformatics 26(1):84. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11917065/pdf/
Citations: 0
A clinical knowledge graph-based framework to prioritize candidate genes for facilitating diagnosis of Mendelian diseases and rare genetic conditions.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-14 DOI: 10.1186/s12859-025-06096-2
Rohan Gnanaolivu, Gavin Oliver, Garrett Jenkinson, Emily Blake, Wenan Chen, Nicholas Chia, Eric W Klee, Chen Wang
Background: Diagnosing Mendelian and rare genetic conditions requires identifying phenotype-associated genetic findings and prioritizing likely disease-causing genes. This task is labor-intensive for molecular and clinical geneticists, who must review extensive literature and databases to link patient phenotypes with causal genotypes. The challenge is further complicated by the large number of genetic variants detected through next-generation sequencing, which impacts both diagnosis timelines and patient care strategies. To address this, in silico methods that prioritize causal genes based on patient-derived phenotypes offer an effective solution, reducing the time involved in diagnostic case reviews and enhancing the efficiency of clinical diagnosis.
Results: We developed the phenotype prioritization and analysis for rare diseases (PPAR) to rank genes based on human phenotype ontology (HPO) terms, with the specific goal of aiding the interpretation of genetic testing for Mendelian and rare diseases. PPAR leverages embeddings from a knowledge graph and incorporates knowledge from connections between genes, HPO terms, and gene ontology annotations. When applied to a clinical rare disease cohort and the publicly available deciphering developmental disorders (DDD) dataset, PPAR ranked the causal gene in the top 10 for 27% of cases in the clinical cohort and for 85% of cases in the DDD dataset, outperforming other established HPO-based methods.
Conclusion: Our findings demonstrate that PPAR, a method developed from the clinical knowledge graph, effectively ranks causal genes based on patient-derived HPO terms in rare and Mendelian disease contexts. PPAR has shown superior performance compared to other well-established HPO-only methods and provides an efficient, accessible solution for clinical geneticists. The Python-based tool is publicly available at https://github.com/dimi-lab/PPAR, offering a user-friendly platform for gene prioritization.
BMC Bioinformatics 26(1):82. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11908102/pdf/
Citations: 0
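The PPAR abstract above reports performance as the share of cases whose causal gene is ranked in the top 10. A minimal sketch of how such a top-k hit rate can be computed is shown below. It is not taken from the PPAR repository; the function name `top_k_hit_rate` and the placeholder gene labels are hypothetical, and a small k is used so the toy data stays readable.

```python
def top_k_hit_rate(rankings, causal_genes, k=10):
    """Fraction of cases whose known causal gene appears among the first k ranked genes."""
    if len(rankings) != len(causal_genes):
        raise ValueError("need exactly one causal gene per ranked list")
    hits = sum(
        1
        for ranked, causal in zip(rankings, causal_genes)
        if causal in ranked[:k]
    )
    return hits / len(causal_genes)

# Three hypothetical cases, each with a ranked gene list and one known causal gene.
rankings = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_D", "GENE_E", "GENE_F"],
    ["GENE_G", "GENE_H", "GENE_I"],
]
causal = ["GENE_B", "GENE_F", "GENE_Z"]

print(top_k_hit_rate(rankings, causal, k=2))
print(top_k_hit_rate(rankings, causal, k=3))
```

Reporting the metric at a fixed k makes results comparable across methods that return ranked lists of different lengths, which is how the abstract contrasts PPAR with other HPO-based tools.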
Constructing multilayer PPI networks based on homologous proteins and integrating multiple PageRank to identify essential proteins.
IF 2.9 | Zone 3 | Biology
BMC Bioinformatics Pub Date : 2025-03-10 DOI: 10.1186/s12859-025-06093-5
He Zhao, Huan Xu, Tao Wang, Guixia Liu
Background: Predicting and studying essential proteins not only helps to understand the fundamental requirements for cell survival and growth regulation mechanisms but also deepens our understanding of disease mechanisms and drives drug development. Existing methods for identifying essential proteins primarily focus on PPI networks within a single species, without fully exploiting interspecies homologous relationships. These homologous relationships connect proteins from different species, forming multilayer PPI networks. Some methods only construct interlayer edges based on homologous relationships between two species, without incorporating appropriate biological attributes to assess the biological significance of these edges. Furthermore, homologous proteins are often highly conserved across multiple species, and expanding homologous relationships to more species allows for a more accurate assessment of interlayer edge importance.
Results: To address these issues, we propose a novel model, MLPR, which constructs a multilayer PPI network based on homologous proteins and integrates multiple PageRank algorithms to identify essential proteins. This study combines homologous protein data from three species to construct interlayer transition matrices and assigns weights to interlayer edges by integrating the biological attributes of homologous proteins and cross-species GO annotations. The MLPR model uses multiple PageRank methods to comprehensively consider homologous relationships across species and designs three key parameters to find the optimal combination that balances random walks within layers, global jumps, interlayer biases, and interspecies homologous relationships.
Conclusions: Experimental results show that MLPR outperforms other state-of-the-art methods in terms of performance. Ablation experiments further validate that integrating homologous relationships across three species effectively enhances the overall performance of MLPR and demonstrates the advantages of the multiple PageRank model in identifying essential proteins.
BMC Bioinformatics 26(1):80. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11892321/pdf/
Citations: 0