Effective identification and differential analysis of anticancer peptides
Authors: Lichao Zhang, Xueli Hu, Kang Xiao, Liang Kong
Journal: Biosystems, volume 241, Article 105246 (published 2024-06-05)
DOI: 10.1016/j.biosystems.2024.105246
URL: https://www.sciencedirect.com/science/article/pii/S030326472400131X
Citations: 0
Abstract
Anticancer peptides (ACPs) have recently emerged as promising cancer therapeutics because of their selectivity and lower toxicity. However, the number of experimentally validated ACPs is limited, and identifying ACPs from large-scale sequence data experimentally is time-consuming and expensive. It is therefore critical to develop new computational models for identifying ACPs and to improve existing ones. This study proposes a computational method, ACP_DA, based on peptide residue composition and physicochemical property information. To curb overfitting and reduce computational cost, a sequential forward selection method was used to construct the optimal feature groups. The resulting feature vectors were then fed into a light gradient boosting machine (LightGBM) classifier for model construction. On an independent test set, ACP_DA achieved the highest Matthews correlation coefficient (0.63) and accuracy (0.8129), an improvement of at least 2% over state-of-the-art methods. These results demonstrate the effectiveness of ACP_DA as a tool for identifying ACPs, with the potential to contribute to the development and optimization of promising therapies. The data and source code are available at https://github.com/Zlclab/ACP_DA.
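The abstract names three building blocks: residue-composition features, sequential forward selection over feature groups, and evaluation by Matthews correlation coefficient and accuracy. The following is a minimal, standard-library-only sketch of those ideas, not the authors' implementation. The feature-group names (`AAC`, `DPC`, `PCP`) and the `toy_score` function are hypothetical stand-ins; in the paper the score would presumably come from training and validating a LightGBM classifier on each candidate group set, a detail the abstract does not specify.

```python
from __future__ import annotations

import math
from typing import Callable, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide: str) -> list[float]:
    """Fraction of each standard residue in the peptide (a 20-dim vector)."""
    peptide = peptide.upper()
    counts = [peptide.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)  # non-standard characters are ignored
    if total == 0:
        raise ValueError("peptide contains no standard residues")
    return [c / total for c in counts]


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from confusion-matrix counts; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def sequential_forward_selection(
    groups: Sequence[str],
    score: Callable[[tuple[str, ...]], float],
) -> tuple[list[str], float]:
    """Greedily add the feature group that most improves `score`.

    Stops as soon as no remaining group raises the score.
    """
    selected: list[str] = []
    remaining = list(groups)
    best = float("-inf")
    while remaining:
        # Score every one-group extension of the current selection.
        trials = [(score(tuple(selected + [g])), g) for g in remaining]
        top_score, top_group = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no extension helps; keep the current selection
        selected.append(top_group)
        remaining.remove(top_group)
        best = top_score
    return selected, best


# Hypothetical scoring function: a fixed gain per group, standing in for
# a validated classifier metric such as MCC.
TOY_GAINS = {"AAC": 0.5, "DPC": 0.2, "PCP": -0.1}


def toy_score(subset: tuple[str, ...]) -> float:
    return sum(TOY_GAINS[g] for g in subset)


if __name__ == "__main__":
    print(amino_acid_composition("AAKK"))
    print(sequential_forward_selection(list(TOY_GAINS), toy_score))
    print(matthews_corrcoef(tp=1, tn=1, fp=0, fn=0))
```

Passing the scoring function in as a parameter keeps the selection loop independent of the classifier, so the toy score can be swapped for a cross-validated model metric without changing the greedy search itself.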
About the journal:
BioSystems encourages experimental, computational, and theoretical articles that link biology, evolutionary thinking, and the information processing sciences. The link areas form a circle that encompasses the fundamental nature of biological information processing, computational modeling of complex biological systems, evolutionary models of computation, the application of biological principles to the design of novel computing systems, and the use of biomolecular materials to synthesize artificial systems that capture essential principles of natural biological information processing.