M. Fernanda Vieira , José Duarte , Rita Domingues , Hugo Oliveira , Oscar Dias
Computers in Biology and Medicine, vol. 188, Article 109836. Published 2025-02-13. Impact factor: 6.3 (JCR Q1, Biology). DOI: 10.1016/j.compbiomed.2025.109836. URL: https://www.sciencedirect.com/science/article/pii/S0010482525001866
PhageDPO: A machine-learning based computational framework for identifying phage depolymerases
Bacteriophages (phages) are the most abundant and genetically diverse biological entities on Earth. Phages are viruses that infect bacteria and encode numerous proteins with potential biotechnological applications. However, most phage-encoded proteins remain functionally uncharacterized. Depolymerases (DPOs) in particular, enzymes that degrade external polysaccharide structures, have garnered increasing interest both from a fundamental research standpoint and for biotechnological applications to control bacterial pathogens. Although several tools for predicting DPOs in phage genomes already exist, we introduce PhageDPO as a robust and reliable solution. PhageDPO is trained on a comprehensive dataset that includes sequences related to seven specific DPO-related domains, complemented with DPOs validated in the literature. Training a Support Vector Machine (SVM) model resulted in a test accuracy of 96 %, a recall of 97 %, a precision of 94 % and an F1-score of 96 %, demonstrating its capability in predicting DPOs in phage genomes. The model was further validated using both cases reported in the literature and data newly generated for this study, enhancing its performance. Beyond its predictive performance, PhageDPO offers a user-friendly interface, making it more accessible and effective than other tools with graphical interfaces.
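The four scores quoted in the abstract (accuracy, recall, precision, F1-score) are all derived from the counts of a binary confusion matrix. As a minimal sketch of how they relate, the Python snippet below computes them from hypothetical counts. The `ConfusionCounts` class, the `classification_metrics` function, and the numbers used are illustrative assumptions, not PhageDPO's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (here: DPO vs. non-DPO). Hypothetical helper."""
    tp: int  # DPOs correctly predicted as DPOs
    fp: int  # non-DPOs wrongly predicted as DPOs
    tn: int  # non-DPOs correctly predicted as non-DPOs
    fn: int  # DPOs wrongly predicted as non-DPOs


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall and F1 for the given counts."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (c.tp + c.tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up counts for illustration only; they are NOT taken from the paper.
    example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is a quick sanity check when reading a reported set of scores.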
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.