A machine learning algorithm for the automatic classification of Phytophthora infestans genotypes into clonal lineages
Authors: Camilo Patarroyo, Stéphane Dupas, Silvia Restrepo
Journal: Applications in Plant Sciences
Published: 2024-07-23
DOI: 10.1002/aps3.11603 (https://onlinelibrary.wiley.com/doi/10.1002/aps3.11603)
Abstract
Premise
The prompt categorization of Phytophthora infestans isolates into described clonal lineages is a key tool for the management of its associated disease, potato late blight. New isolates of this pathogen are currently classified by comparing their microsatellite genotypes with characterized clonal lineages, but an automated classification tool would greatly improve this process. Here, we developed a flexible machine learning–based classifier for P. infestans genotypes.
Methods
The performance of different machine learning algorithms in classifying P. infestans genotypes into its clonal lineages was preliminarily evaluated with decreasing amounts of training data. The four best algorithms were then evaluated using all collected genotypes.
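The evaluation design described in the Methods (scoring a classifier as the amount of training data shrinks) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the genotypes are synthetic allele-size profiles, the lineage labels are hypothetical, and the classifier is a simple nearest-neighbour matcher standing in for the algorithms named in the abstract, not the authors' implementation.

```python
import random


def make_genotypes(n_per_lineage, n_loci=12, noise=0.1, seed=0):
    """Build synthetic microsatellite genotypes (illustrative data only).

    Each hypothetical lineage gets a reference allele-size profile; every
    isolate copies it, shifting an allele by one repeat with probability `noise`.
    """
    rng = random.Random(seed)
    lineages = ["lineage_A", "lineage_B", "lineage_C"]  # hypothetical labels
    references = {
        name: [rng.randrange(100, 300, 2) for _ in range(n_loci)]
        for name in lineages
    }
    data = []
    for name, ref in references.items():
        for _ in range(n_per_lineage):
            genotype = [
                a + rng.choice([-2, 2]) if rng.random() < noise else a
                for a in ref
            ]
            data.append((genotype, name))
    return data


def predict_1nn(train, genotype):
    """Return the label of the training genotype with the fewest mismatching loci."""
    def mismatches(other):
        return sum(a != b for a, b in zip(genotype, other))

    return min(train, key=lambda item: mismatches(item[0]))[1]


def accuracy(train, test):
    """Fraction of test genotypes assigned to their true lineage."""
    correct = sum(predict_1nn(train, g) == label for g, label in test)
    return correct / len(test)


def learning_curve(data, fractions, test_share=0.3, seed=0):
    """Score the classifier on a fixed test set with shrinking training sets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_share)
    test, pool = shuffled[:n_test], shuffled[n_test:]
    results = {}
    for frac in fractions:
        n_train = max(1, int(len(pool) * frac))
        results[frac] = accuracy(pool[:n_train], test)
    return results


if __name__ == "__main__":
    data = make_genotypes(n_per_lineage=40)
    for frac, acc in learning_curve(data, [1.0, 0.5, 0.25, 0.1]).items():
        print(f"training fraction {frac:.2f}: accuracy {acc:.3f}")
```

Holding the test set fixed while only the training subset shrinks keeps the accuracies comparable across fractions. Real microsatellite data would need additional handling (for example, multiple alleles per locus and missing calls) that this sketch leaves out.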
Results
mlpML, cforest, nnet, and AdaBag performed best in the preliminary test, correctly classifying almost 100% of the genotypes. AdaBag performed significantly better than the others when tested using the complete data set (Tukey HSD P < 0.001). This algorithm was then implemented in a web application for the automated classification of P. infestans genotypes, which is freely available at https://github.com/cpatarroyo/genotypeclas.
Discussion
We developed a gradient boosting–based tool to automatically classify P. infestans genotypes into its clonal lineages. This could become a valuable resource for the prompt identification of clonal lineages spreading into new regions.
About the journal:
Applications in Plant Sciences (APPS) is a monthly, peer-reviewed, open access journal promoting the rapid dissemination of newly developed, innovative tools and protocols in all areas of the plant sciences, including genetics, structure, function, development, evolution, systematics, and ecology. Given the rapid progress today in technology and its application in the plant sciences, the goal of APPS is to foster communication within the plant science community to advance scientific research. APPS is a publication of the Botanical Society of America, originating in 2009 as the American Journal of Botany's online-only section, AJB Primer Notes & Protocols in the Plant Sciences.
APPS publishes the following types of articles: (1) Protocol Notes describe new methods and technological advancements; (2) Genomic Resources Articles characterize the development and demonstrate the usefulness of newly developed genomic resources, including transcriptomes; (3) Software Notes detail new software applications; (4) Application Articles illustrate the application of a new protocol, method, or software application within the context of a larger study; (5) Review Articles evaluate available techniques, methods, or protocols; (6) Primer Notes report novel genetic markers with evidence of wide applicability.