DeepMineLys: Deep mining of phage lysins from human microbiome
Authors: Yiran Fu, Shuting Yu, Jianfeng Li, Zisha Lao, Xiaofeng Yang, Zhanglin Lin
Journal: Cell Reports (impact factor 7.5; JCR Q1, Cell Biology)
DOI: 10.1016/j.celrep.2024.114583 (https://doi.org/10.1016/j.celrep.2024.114583)
Published: 2024-08-27 (Epub 2024-08-06)
Citations: 0
Abstract
Vast shotgun metagenomics data remain an underutilized resource for novel enzymes. Artificial intelligence (AI) has increasingly been applied to protein mining, but its conventional performance evaluation is interpolative in nature, and these trained models often struggle to extrapolate effectively when challenged with unknown data. In this study, we present a framework (DeepMineLys [deep mining of phage lysins from human microbiome]) based on the convolutional neural network (CNN) to identify phage lysins from three human microbiome datasets. When validated with an independent dataset, our method achieved an F1-score of 84.00%, surpassing existing methods by 20.84%. We expressed 16 lysin candidates from the top 100 sequences in E. coli, confirming 11 as active. The best one displayed an activity 6.2-fold that of lysozyme derived from hen egg white, establishing it as the most potent lysin from the human microbiome. Our study also underscores several important issues when applying AI to biology questions. This framework should be applicable for mining other proteins.
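For readers unfamiliar with the metric quoted in the abstract, the F1-score is the harmonic mean of precision and recall. The following is a minimal Python sketch of that calculation. The function name `f1_score` and the counts are hypothetical, chosen only for illustration; they are not taken from the paper or its datasets.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted lysins that are real
    recall = tp / (tp + fn)     # fraction of real lysins that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not the paper's data).
example = f1_score(tp=8, fp=2, fn=2)
print(f"F1 = {example:.2%}")
```

Because F1 combines precision and recall, a classifier cannot score well by only avoiding false positives or only avoiding false negatives.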
About the journal:
Cell Reports publishes high-quality research across the life sciences and focuses on new biological insight as its primary criterion for publication. The journal offers three primary article types: Reports, which are shorter single-point articles; Research Articles, which are longer and provide deeper mechanistic insights; and Resources, which highlight significant technical advances or major informational datasets that contribute to biological advances. Reviews covering recent literature in emerging and active fields are also accepted.
The Cell Reports Portfolio includes gold open-access journals that cover life, medical, and physical sciences, and its mission is to make cutting-edge research and methodologies available to a wide readership.
The journal's professional in-house editors work closely with authors, reviewers, and the scientific advisory board, which consists of current and future leaders in their respective fields. The advisory board guides the scope, content, and quality of the journal, but editorial decisions are independently made by the in-house scientific editors of Cell Reports.