Fangping Wan, Marcelo D T Torres, Changge Guan, Cesar de la Fuente-Nunez
Nature Protocols. Journal Article. Published 2025-05-14. DOI: https://doi.org/10.1038/s41596-025-01144-w
Tutorial: guidelines for the use of machine learning methods to mine genomes and proteomes for antibiotic discovery.
Genomes and proteomes constitute a rich reservoir of molecular diversity. However, they have remained underexplored because of a lack of appropriate tools. In recent years, computational approaches have been developed to mine this unexplored biological information, or dark matter, accelerating the discovery of new antibiotic molecules. Such efforts have yielded a wide range of new molecules. These include peptides released via predicted proteolytic cleavage of larger proteins, termed 'encrypted peptides', which have been found to be widespread in nature. Molecules encoded by and translated from small open reading frames within genomic sequences have also been uncovered, further expanding the landscape of bioactive compounds. Here, we discuss computational approaches, including machine learning and artificial intelligence (AI) tools, which have been used to date to identify antimicrobial compounds, with a special emphasis on peptides. We also propose potential avenues for future exploration in this rapidly evolving field. Moreover, we provide an overview of the experimental methods commonly used to validate these computational predictions. We anticipate that efforts combining cutting-edge AI and experimental approaches for biological sequence mining will reveal new insights into host immunity and continue to accelerate discoveries in the fields of antibiotics and infectious diseases.
About the journal:
Nature Protocols focuses on publishing protocols used to address significant biological and biomedical science research questions, including methods grounded in physics and chemistry with practical applications to biological problems. The journal caters to a primary audience of research scientists and, as such, exclusively publishes protocols with research applications. Protocols primarily aimed at influencing patient management and treatment decisions are not featured.
The specific techniques covered encompass a wide range, including but not limited to: Biochemistry, Cell biology, Cell culture, Chemical modification, Computational biology, Developmental biology, Epigenomics, Genetic analysis, Genetic modification, Genomics, Imaging, Immunology, Isolation, purification, and separation, Lipidomics, Metabolomics, Microbiology, Model organisms, Nanotechnology, Neuroscience, Nucleic-acid-based molecular biology, Pharmacology, Plant biology, Protein analysis, Proteomics, Spectroscopy, Structural biology, Synthetic chemistry, Tissue culture, Toxicology, and Virology.