Cycle-ESM: Generation-assisted classification of antifungal peptides using ESM protein language model
Authors: YiMing Wang, Chun Fang
Journal: Computational Biology and Chemistry, vol. 113, Article 108240
Publication date: 2024-10-16
Publication type: Journal Article
DOI: 10.1016/j.compbiolchem.2024.108240
URL: https://www.sciencedirect.com/science/article/pii/S1476927124002287
Impact factor: 2.6 (JCR Q2, Biology)
Citations: 0
Abstract
The rising prevalence of invasive fungal infections and the emergence of antifungal resistance highlight the urgent need for new antifungal medications. Antifungal peptides have emerged as promising alternatives to traditional antimicrobial agents. The identification of natural or synthetic antifungal peptides is crucial for advancing antifungal drug development. Typically, the availability of antifungal samples is limited, and significant sequence diversity exists among antifungal peptides, posing challenges for high-throughput screening. To address the identification challenge of antifungal peptides with limited sample availability, this study introduces the Cycle ESM method. Initially, the method utilises the ESM protein language model to generate additional data on antifungal peptides, serving as a data augmentation technique to enhance model training effectiveness. Subsequently, the ESM is employed in conjunction with a textCNN model to construct a classifier for peptide prediction, with a comprehensive exploration of peptide characteristics to improve prediction accuracy. Experimental results demonstrate that the performance of the Cycle ESM method surpasses that of existing methods across three distinct antifungal peptide datasets. This study presents a novel approach to antifungal peptide prediction and offers innovative insights for addressing classification problems with limited sample availability.
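The two stages described in the abstract can be illustrated with a minimal, dependency-free sketch. The abstract does not say how ESM generates new peptides or how the textCNN is configured, so everything here is an assumption: `mask_and_fill` shows one common way a masked language model could produce sequence variants, with the hypothetical `fill_fn` hook standing in for an ESM prediction, and `conv_max_pool` shows the convolution plus max-over-time pooling step that textCNN-style classifiers are built on, with one-hot vectors standing in for ESM embeddings. The peptide string is an arbitrary example, not data from the paper.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mask_and_fill(seq, fill_fn, n_masks=2, rng=None):
    """Return a variant of `seq` with up to `n_masks` positions re-predicted.

    `fill_fn(seq, pos)` is a hypothetical hook. In a real pipeline it would
    ask a protein language model for a residue at `pos`; here it is a stand-in.
    """
    rng = rng or random.Random(0)
    residues = list(seq)
    positions = rng.sample(range(len(residues)), min(n_masks, len(residues)))
    for pos in positions:
        residues[pos] = fill_fn("".join(residues), pos)
    return "".join(residues)


def one_hot(seq):
    """Stand-in per-residue embedding (assumption: ESM vectors would go here)."""
    return [[1.0 if aa == a else 0.0 for a in AMINO_ACIDS] for aa in seq]


def conv_max_pool(embedded, kernel):
    """textCNN-style feature: slide `kernel` (width x dim) along the sequence,
    apply ReLU to each window score, then keep the maximum over positions."""
    width = len(kernel)
    scores = []
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        score = sum(k * x
                    for k_row, x_row in zip(kernel, window)
                    for k, x in zip(k_row, x_row))
        scores.append(max(score, 0.0))
    return max(scores) if scores else 0.0


# Stage 1 (augmentation): a random filler replaces the language model.
rng = random.Random(42)
original = "GLFDIVKKVVGALGSL"  # arbitrary example sequence
variant = mask_and_fill(original, lambda s, p: rng.choice(AMINO_ACIDS),
                        n_masks=3, rng=rng)

# Stage 2 (classification feature): a hand-written width-2 kernel that
# responds most strongly to two consecutive lysines ("KK").
kk_kernel = one_hot("KK")
feature_with_kk = conv_max_pool(one_hot(original), kk_kernel)
feature_without_kk = conv_max_pool(one_hot("GLFDIVKAVV"), kk_kernel)
```

In a trained textCNN the kernels are learned rather than hand-written, several kernel widths are used in parallel, and the pooled features feed a final classification layer. The sketch only shows the shape of the computation.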
About the journal:
Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High-quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high-quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered.
Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, complementary techniques such as molecular dynamics simulations with detailed free energy calculations should be used to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered.
Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.