BMC Bioinformatics: Latest Articles

Novel artificial intelligence-based identification of drug-gene-disease interaction using protein-protein interaction.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-18 DOI: 10.1186/s12859-024-06009-9
Y-H Taguchi, Turki Turki
Vol. 25, no. 1, article 377. Journal Article.
Abstract: The evaluation of drug-gene-disease interactions is key for the identification of drugs effective against disease. However, at present, drugs that are effective against genes that are critical for disease are difficult to identify. Following a disease-centric approach, there is a need to identify genes critical to disease function and find drugs that are effective against them. By contrast, a drug-centric approach comprises identifying the genes targeted by drugs, and then the diseases in which the identified genes are critical. Both of these processes are complex. Using a gene-centric approach, whereby we identify genes that are effective against the disease and can be targeted by drugs, is much easier. However, how such sets of genes can be identified without specifying either the target diseases or drugs is not known. In this study, a novel artificial intelligence-based approach that employs unsupervised methods and identifies genes without specifying either diseases or drugs is presented. To evaluate its feasibility, we applied tensor decomposition (TD)-based unsupervised feature extraction (FE) to perform drug repositioning from protein-protein interactions (PPI) without any other information. Proteins selected by TD-based unsupervised FE include many genes related to cancers, as well as drugs that target the selected proteins. Thus, we were able to identify cancer drugs using only PPI. Because the selected proteins had more interactions, we replaced the selected proteins with hub proteins and found that hub proteins themselves could be used for drug repositioning. In contrast to hub proteins, which can only identify cancer drugs, TD-based unsupervised FE enables the identification of drugs for other diseases. In addition, TD-based unsupervised FE can be used to identify drugs that are effective in in vivo experiments, which is difficult when hub proteins are used. In conclusion, TD-based unsupervised FE is a useful tool for drug repositioning using only PPI without other information.
Citations: 0
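The abstract above contrasts TD-based selection with a simpler baseline: picking hub proteins, that is, proteins with many interaction partners. As a rough illustration of that baseline only (not the authors' code, and not the tensor decomposition step), the sketch below ranks proteins by degree in a PPI edge list. The function name `hub_proteins`, the `top_k` parameter, and the toy edge list are assumptions made for this example.

```python
def hub_proteins(edges, top_k=2):
    """Return the top_k (protein, degree) pairs, ranked by number of distinct partners."""
    partners = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    degree = {protein: len(others) for protein, others in partners.items()}
    # Sort by descending degree, then by name so ties are deterministic.
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical toy network; the duplicate MDM2/TP53 edge is counted once.
toy_edges = [
    ("TP53", "MDM2"),
    ("TP53", "EP300"),
    ("TP53", "BRCA1"),
    ("BRCA1", "EP300"),
    ("MDM2", "TP53"),
]
print(hub_proteins(toy_edges))
```

Storing partners in sets makes the degree robust to duplicated or reversed edges, which are common in merged PPI resources.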
CNVizard-a lightweight streamlit application for an interactive analysis of copy number variants.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-17 DOI: 10.1186/s12859-024-06010-2
Jeremias Krause, Carlos Classen, Daniela Dey, Eva Lausberg, Luise Kessler, Thomas Eggermann, Ingo Kurth, Matthias Begemann, Florian Kraft
Vol. 25, no. 1, article 376. Journal Article.
Background: Methods to call, analyze and visualize copy number variations (CNVs) from massive parallel sequencing data have been widely adopted in clinical practice and genetic research. To enable a streamlined analysis of CNV data, comprehensive annotations and good visualizations are indispensable. The ability to detect single exon CNVs is another important feature for genetic testing. Nonetheless, most available open-source tools come with limitations in at least one of these areas. One additional drawback is that available tools deliver data in an unstructured and static format which requires subsequent visualization and formatting efforts.
Results: Here we present CNVizard, an interactive Streamlit app allowing a comprehensive visualization of CNVkit data. Furthermore, combining CNVizard with the CNVand pipeline allows the annotation and visualization of CNV or SV VCF files from any CNV caller.
Conclusion: CNVizard, in combination with CNVand, enables the comprehensive and streamlined analysis of short- and long-read sequencing data and provides an intuitive webapp-like experience enabling an interactive visualization of CNV data.
Citations: 0
MTAF-DTA: multi-type attention fusion network for drug-target affinity prediction.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-05 DOI: 10.1186/s12859-024-05984-3
Jinghong Sun, Han Wang, Jia Mi, Jing Wan, Jingyang Gao
Vol. 25, no. 1, article 375. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622562/pdf/
Background: The development of drug-target binding affinity (DTA) prediction tasks significantly drives the drug discovery process forward. Leveraging the rapid advancement of artificial intelligence, DTA prediction tasks have undergone a transformative shift from wet lab experimentation to machine learning-based prediction. This transition enables a more expedient exploration of potential interactions between drugs and targets, leading to substantial savings in time and funding resources. However, existing methods still face several challenges, such as drug information loss, lack of calculation of the contribution of each modality, and lack of simulation regarding the drug-target binding mechanisms.
Results: We propose MTAF-DTA, a method for drug-target binding affinity prediction to solve the above problems. The drug representation module extracts three modalities of features from drugs and uses an attention mechanism to update their respective contribution weights. Additionally, we design a Spiral-Attention Block (SAB) as a drug-target feature fusion module based on multi-type attention mechanisms, facilitating a triple fusion process between them. The SAB, to some extent, simulates the interactions between drugs and targets, thereby enabling outstanding performance in the DTA task. Our regression task on the Davis and KIBA datasets demonstrates the predictive capability of MTAF-DTA, with CI and MSE metrics showing respective improvements of 1.1% and 9.2% over the state-of-the-art (SOTA) method in the novel target settings. Furthermore, downstream tasks further validate MTAF-DTA's superiority in DTA prediction.
Conclusions: Experimental results and a case study demonstrate the superior performance of our approach in DTA prediction tasks, showing its potential in practical applications such as drug discovery and disease treatment.
Citations: 0
Pheno-Ranker: a toolkit for comparison of phenotypic data stored in GA4GH standards and beyond.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-04 DOI: 10.1186/s12859-024-05993-2
Ivo C Leist, María Rivas-Torrubia, Marta E Alarcón-Riquelme, Guillermo Barturen, Precisesads Clinical Consortium, Ivo G Gut, Manuel Rueda
Vol. 25, no. 1, article 373. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616229/pdf/
Background: Phenotypic data comparison is essential for disease association studies, patient stratification, and genotype-phenotype correlation analysis. To support these efforts, the Global Alliance for Genomics and Health (GA4GH) established Phenopackets v2 and Beacon v2 standards for storing, sharing, and discovering genomic and phenotypic data. These standards provide a consistent framework for organizing biological data, simplifying their transformation into computer-friendly formats. However, matching participants using GA4GH-based formats remains challenging, as current methods are not fully compatible, limiting their effectiveness.
Results: Here, we introduce Pheno-Ranker, an open-source software toolkit for individual-level comparison of phenotypic data. As input, it accepts JSON/YAML data exchange formats from Beacon v2 and Phenopackets v2 data models, as well as any data structure encoded in JSON, YAML, or CSV formats. Internally, the hierarchical data structure is flattened to one dimension and then transformed through one-hot encoding. This allows for efficient pairwise (all-to-all) comparisons within cohorts or for matching of a patient's profile in cohorts. Users have the flexibility to refine their comparisons by including or excluding terms, applying weights to variables, and obtaining statistical significance through Z-scores and p-values. The output consists of text files, which can be further analyzed using unsupervised learning techniques, such as clustering or multidimensional scaling (MDS), and with graph analytics. Pheno-Ranker's performance has been validated with simulated and synthetic data, showing its accuracy, robustness, and efficiency across various health data scenarios. A real data use case from the PRECISESADS study highlights its practical utility in clinical research.
Conclusions: Pheno-Ranker is a user-friendly, lightweight software for semantic similarity analysis of phenotypic data in Beacon v2 and Phenopackets v2 formats, extendable to other data types. It enables the comparison of a wide range of variables beyond HPO or OMIM terms while preserving full context. The software is designed as a command-line tool with additional utilities for CSV import, data simulation, summary statistics plotting, and QR code generation. For interactive analysis, it also includes a web-based user interface built with R Shiny. Links to the online documentation, including a Google Colab tutorial, and the tool's source code are available on the project home page: https://github.com/CNAG-Biomedical-Informatics/pheno-ranker
Citations: 0
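The Pheno-Ranker abstract describes flattening a hierarchical record to one dimension, one-hot encoding it, and comparing individuals pairwise. The sketch below illustrates that general idea in Python; it is not Pheno-Ranker's implementation or its key format. The helper names (`flatten`, `one_hot`, `hamming`), the `path=value` feature naming, and the two toy records are assumptions for this example.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a set of 'path=value' feature strings."""
    features = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            features |= flatten(value, f"{prefix}{key}.")
    elif isinstance(obj, list):
        for item in obj:
            features |= flatten(item, prefix)  # list positions are ignored
    else:
        features.add(f"{prefix.rstrip('.')}={obj}")
    return features


def one_hot(features, vocabulary):
    """Binary vector marking which vocabulary entries are present."""
    return [1 if term in features else 0 for term in vocabulary]


def hamming(u, v):
    """Number of positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


# Hypothetical, heavily simplified records (not valid Phenopackets).
patient_a = {"sex": "female",
             "phenotypicFeatures": [{"id": "HP:0000118"}, {"id": "HP:0001250"}]}
patient_b = {"sex": "female",
             "phenotypicFeatures": [{"id": "HP:0001250"}]}

feats_a, feats_b = flatten(patient_a), flatten(patient_b)
vocabulary = sorted(feats_a | feats_b)
vec_a, vec_b = one_hot(feats_a, vocabulary), one_hot(feats_b, vocabulary)
print(vocabulary)
print(vec_a, vec_b, hamming(vec_a, vec_b))
```

Building the vocabulary from the union of all observed features keeps every vector the same length, so any binary distance (Hamming here, Jaccard as an alternative) can be applied to all pairs in a cohort.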
Correction: Maptcha: an efficient parallel workflow for hybrid genome scaffolding.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-04 DOI: 10.1186/s12859-024-05957-6
Oieswarya Bhowmik, Tazin Rahman, Ananth Kalyanaraman
Vol. 25, no. 1, article 374. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11616244/pdf/
Citations: 0
Managing false positives during detection of pathogen sequences in shotgun metagenomics datasets.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-03 DOI: 10.1186/s12859-024-05952-x
Lauren M Bradford, Catherine Carrillo, Alex Wong
Vol. 25, no. 1, article 372. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11613480/pdf/
Background: Culture-independent diagnostic tests are gaining popularity as tools for detecting pathogens in food. Shotgun sequencing holds substantial promise for food testing as it provides abundant information on microbial communities, but the challenge is in analyzing large and complex sequencing datasets with a high degree of both sensitivity and specificity. Falsely classifying sequencing reads as originating from pathogens can lead to unnecessary food recalls or production shutdowns, while low sensitivity resulting in false negatives could lead to preventable illness.
Results: We used simulated and published shotgun sequencing datasets containing Salmonella-derived reads to explore the appearance and mitigation of false positive results using the popular taxonomic annotation tools Kraken2 and Metaphlan4. Using default parameters, Kraken2 is sensitive but prone to false positives, while Metaphlan4 is more specific but unable to detect Salmonella at low abundance. We then developed a bioinformatic pipeline for identifying and removing reads falsely identified as Salmonella by Kraken2 while retaining high sensitivity. Carefully considering software parameters and database choices is essential to avoiding false positive sample calls. With well-chosen parameters plus additional steps to confirm the taxonomic origin of reads, it is possible to detect pathogens with very high specificity and sensitivity.
Citations: 0
The evaluation of transcription factor binding site prediction tools in human and Arabidopsis genomes.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-02 DOI: 10.1186/s12859-024-05995-0
Dinithi V Wanniarachchi, Sameera Viswakula, Anushka M Wickramasuriya
Vol. 25, no. 1, article 371. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11613939/pdf/
Background: The precise prediction of transcription factor binding sites (TFBSs) is pivotal for unraveling the gene regulatory networks underlying biological processes. While numerous tools have emerged for in silico TFBS prediction in recent years, the evolving landscape of computational biology necessitates thorough assessments of tool performance to ensure accuracy and reliability. Only a limited number of studies have been conducted to evaluate the performance of TFBS prediction tools comprehensively. Thus, the present study focused on assessing twelve widely used TFBS prediction tools and four de novo motif discovery tools using a benchmark dataset comprising real, generic, Markov, and negative sequences. TFBSs of Arabidopsis thaliana and Homo sapiens genomes downloaded from the JASPAR database were implanted in these sequences and the performance of tools was evaluated using several statistical parameters at different overlap percentages between the lengths of known and predicted binding sites.
Results: Overall, the Multiple Cluster Alignment and Search Tool (MCAST) emerged as the best TFBS prediction tool, followed by Find Individual Motif Occurrences (FIMO) and MOtif Occurrence Detection Suite (MOODS). In addition, MotEvo and Dinucleotide Weight Tensor Toolbox (DWT-toolbox) demonstrated the highest sensitivity in identifying TFBSs at 90% and 80% overlap. Further, MCAST and DWT-toolbox demonstrated the highest sensitivity across all three data types: real, generic, and Markov. Among the de novo motif discovery tools, the Multiple EM for Motif Elicitation (MEME) emerged as the best performer. An analysis of the promoter regions of genes involved in the anthocyanin biosynthesis pathway in plants and the pentose phosphate pathway in humans, using the three best-performing tools, revealed considerable variation among the top 20 motifs identified by these tools.
Conclusion: The findings of this study lay a robust groundwork for selecting optimal TFBS prediction tools for future research. Given the variability observed in tool performance, employing multiple tools for identifying TFBSs in a set of sequences is highly recommended. In addition, further studies are recommended to develop an integrated toolbox that incorporates TFBS prediction or motif discovery tools, aiming to streamline result precision and accuracy.
Citations: 0
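The TFBS benchmark above scores tools at different overlap percentages between known and predicted binding sites. The abstract does not give the exact overlap definition, so the sketch below assumes one plausible reading: the fraction of the known site's length covered by a prediction, with sensitivity counted as the share of known sites recovered at or above a threshold. The function names, the half-open `(start, end)` interval convention, and the toy coordinates are assumptions for this example.

```python
def overlap_fraction(known, predicted):
    """Fraction of the known site (half-open start, end) covered by the prediction."""
    k_start, k_end = known
    p_start, p_end = predicted
    shared = max(0, min(k_end, p_end) - max(k_start, p_start))
    return shared / (k_end - k_start)


def sensitivity(known_sites, predicted_sites, min_overlap):
    """Share of known sites matched by at least one prediction at min_overlap."""
    hits = sum(
        any(overlap_fraction(k, p) >= min_overlap for p in predicted_sites)
        for k in known_sites
    )
    return hits / len(known_sites)


# Hypothetical coordinates: the first site is covered 9/10, the second 6/10.
known_sites = [(100, 110), (200, 210)]
predicted_sites = [(101, 111), (204, 214)]

print(sensitivity(known_sites, predicted_sites, min_overlap=0.8))
print(sensitivity(known_sites, predicted_sites, min_overlap=0.5))
```

Raising the threshold can only keep or lower sensitivity, which is why benchmark results are usually reported at several overlap levels.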
Single-character insertion-deletion model preserves long indels in ancestral sequence reconstruction.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-12-02 DOI: 10.1186/s12859-024-05986-1
Gholamhossein Jowkar, Jūlija Pečerska, Manuel Gil, Maria Anisimova
Vol. 25, no. 1, article 370. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610121/pdf/
Abstract: Insertions and deletions (indels) play a significant role in genome evolution across species. Realistic modelling of indel evolution is challenging and is still an open research question. Several attempts have been made to explicitly model multi-character (long) indels, such as TKF92, by relaxing the site independence assumption and introducing fragments. However, these methods are computationally expensive. On the other hand, the Poisson Indel Process (PIP) assumes site independence but allows one to infer single-character indels on the phylogenetic tree, distinguishing insertions from deletions. PIP's marginal likelihood computation has linear time complexity, enabling ancestral sequence reconstruction (ASR) with indels in linear time. Recently, we developed ARPIP, an ASR method using PIP, capable of inferring indel events with explicit evolutionary interpretations. Here, we investigate the effect of the single-character indel assumption on reconstructed ancestral sequences on mammalian protein orthologs and on simulated data. We show that ARPIP's ancestral estimates preserve the gap length distribution observed in the input alignment. In mammalian proteins the lengths of inserted segments appear to be substantially longer compared to deleted segments. Further, we confirm the well-established deletion bias observed in real data. To date, ARPIP is the only ancestral reconstruction method that explicitly models insertion and deletion events over time. Given a good quality input alignment, it can capture ancestral long indel events on the phylogeny.
Citations: 0
A novel phenotype imputation method with copula model.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-11-30 DOI: 10.1186/s12859-024-05990-5
Jianjun Zhang, Jane Zizhen Zhao, Samantha Gonzales, Xuexia Wang, Qiuying Sha
Vol. 25, no. 1, article 369. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11607873/pdf/
Background: Jointly analyzing multiple phenotypes/traits may increase power in genetic association studies by aggregating weak genetic effects. The chance that at least one phenotype is missing increases exponentially as the number of phenotypes increases, especially for a real dataset. It is a common practice to discard individuals with missing phenotypes or phenotypes with a large proportion of missing values. Such a discarding method may lead to a loss of power or even an insufficient sample size for analysis. To our knowledge, many existing phenotype imputation methods are built on multivariate normal assumptions for analysis. Violation of these assumptions may lead to inflated type I errors or even loss of power in some cases. To overcome these limitations, we propose a novel phenotype imputation method based on a new Gaussian copula model with three different loss functions to address the issue of missing phenotypes.
Results: In a variety of simulations and a real genetic association study for lung function, we show that our method outperforms existing methods and can also increase the power of the association test when compared to other comparable phenotype imputation methods. The proposed method is implemented in an R package available at https://github.com/jane-zizhen-zhao/CopulaPhenoImpute1.0
Conclusions: We propose a novel phenotype imputation method with a new Gaussian copula model based on three loss functions. Results of the simulation studies and real data analyses illustrate that the proposed method outperforms comparable methods.
Citations: 0
Enhanced prediction of hemolytic activity in antimicrobial peptides using deep learning-based sequence analysis.
IF 2.9, Zone 3, Biology
BMC Bioinformatics Pub Date: 2024-11-27 DOI: 10.1186/s12859-024-05983-4
Ibrahim Abdelbaky, Mohamed Elhakeem, Hilal Tayara, Elsayed Badr, Mustafa Abdul Salam
Vol. 25, no. 1, article 368. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11603801/pdf/
Abstract: Antimicrobial peptides (AMPs) are a promising class of antimicrobial drugs due to their broad-spectrum activity against microorganisms. However, their clinical application is limited by their potential to cause hemolysis, the destruction of red blood cells. To address this issue, we propose a deep learning model based on convolutional neural networks (CNNs) for predicting the hemolytic activity of AMPs. Peptide sequences are represented using one-hot encoding, and the CNN architecture consists of multiple convolutional and fully connected layers. The model was trained on six different datasets: HemoPI-1, HemoPI-2, HemoPI-3, RNN-Hem, Hlppredfuse, and AMP-Combined, achieving Matthews correlation coefficients of 0.9274, 0.5614, 0.6051, 0.6142, 0.8799, and 0.7484, respectively. Our model outperforms previously reported methods and can facilitate the development of novel AMPs with reduced hemolytic activity, which is crucial for their therapeutic use in treating bacterial infections.
Citations: 0
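The hemolysis abstract above mentions two standard building blocks: one-hot encoding of peptide sequences as CNN input, and the Matthews correlation coefficient (MCC) as the reported metric. The sketch below shows both in plain Python. It is not the authors' pipeline; the fixed 20-letter alphabet, zero-padding to `max_len`, the handling of unknown residues, and the example peptide are assumptions for this illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_peptide(sequence, max_len):
    """Encode a peptide as a max_len x 20 matrix; short sequences are zero-padded."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, aa in enumerate(sequence[:max_len]):  # longer sequences are truncated
        if aa in index:  # unknown residues are left as all-zero rows
            matrix[pos][index[aa]] = 1
    return matrix


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


encoded = one_hot_peptide("GIGK", max_len=6)  # hypothetical 4-residue peptide
print(len(encoded), len(encoded[0]))
print(mcc(tp=50, tn=50, fp=0, fn=0))
```

MCC ranges from -1 to 1 and uses all four confusion-matrix cells, which makes it a common choice when the hemolytic and non-hemolytic classes are imbalanced.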