Predicting symptoms of downy mildew, powdery mildew, and gray mold diseases of grapevine through machine learning
I. Volpi, D. Guidotti, Michele Mammini, S. Marchi
Italian Journal of Agrometeorology-Rivista Italiana Di Agrometeorologia (IF 1.6, JCR Q2, Agronomy), published 2021-12-27
DOI: https://doi.org/10.36253/ijam-1131
Citations: 5
Abstract
Downy mildew, powdery mildew, and gray mold are major diseases of grapevine that strongly reduce fruit yield and quality. They are controlled by applying chemicals, which may have undesirable effects on the environment and on human health. Monitoring and forecasting crop disease is therefore essential to support integrated pest management (IPM) measures. In this study, two tree-based machine learning (ML) algorithms, random forest and C5.0, were compared for their ability to predict the appearance of symptoms of grapevine diseases from meteorological conditions, spatial indices, the number of crop protection treatments, and the frequency of monitoring days on which symptoms were recorded in the previous year. Data on the presence of symptoms on grapevine, collected in the Tuscany region (Italy) from 2006 to 2017, were split 80/20 into training and test sets; data collected in 2018 and 2019 were used as independent test years for downy mildew and powdery mildew. The frequency of symptoms in the previous year and the cumulative precipitation from April to seven days before the monitoring day were the most important variables for predicting the occurrence of disease symptoms. For all three diseases, the best performance was obtained with C5.0 by applying (i) a technique for handling the imbalanced dataset (symptoms were detected in a minority of observations) and (ii) an optimized cut-off for predictions. The balanced accuracy achieved on the test set was 0.8 for downy mildew, 0.7 for powdery mildew, and 0.9 for gray mold. Applied to the two independent years (2018 and 2019), the models for downy mildew and powdery mildew achieved a lower balanced accuracy, around 0.7 for both diseases. The machine learning models were able to select the best predictors and to unravel the complex relationships among geographic indices, bioclimatic indices, protection treatments, and the frequency of symptoms in the previous year.
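The two evaluation ideas named in the abstract, balanced accuracy and an optimized prediction cut-off, can be illustrated with a small sketch. The abstract does not say how the cut-off was optimized or which imbalance technique was used, so the code below rests on assumptions: the function names `balanced_accuracy` and `best_cutoff`, the candidate grid, the choice to maximize balanced accuracy, and the toy data are all hypothetical and are not taken from the study.

```python
from typing import List, Optional, Sequence, Tuple


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


def best_cutoff(
    y_true: Sequence[int],
    scores: Sequence[float],
    candidates: Optional[List[float]] = None,
) -> Tuple[float, float]:
    """Return (cutoff, balanced accuracy) for the first candidate that maximizes
    balanced accuracy. Assumption: the study's actual criterion is not stated."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # hypothetical grid 0.01..0.99
    best_c, best_ba = candidates[0], -1.0
    for c in candidates:
        preds = [1 if s >= c else 0 for s in scores]
        ba = balanced_accuracy(y_true, preds)
        if ba > best_ba:  # strict '>' keeps the lowest cut-off among ties
            best_c, best_ba = c, ba
    return best_c, best_ba


# Toy, made-up data: symptoms (class 1) in only 2 of 10 observations.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.45, 0.30, 0.40]

# With the conventional 0.5 cut-off every observation is predicted negative.
default_preds = [1 if s >= 0.5 else 0 for s in scores]
default_ba = balanced_accuracy(y_true, default_preds)

cutoff, tuned_ba = best_cutoff(y_true, scores)
print(f"default cut-off 0.50: balanced accuracy = {default_ba:.4f}")
print(f"tuned cut-off {cutoff:.2f}: balanced accuracy = {tuned_ba:.4f}")
```

On imbalanced data a classifier can reach high plain accuracy by always predicting the majority class, which is why balanced accuracy and a lowered cut-off are useful here. In practice the cut-off would be chosen on training or validation data and only then applied to the test set; the toy example tunes and evaluates on the same data purely for brevity.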
About the journal:
The journal's areas of specific interest include: ecophysiology; phenology; plant growth, quality and quantity of production; plant pathology; entomology; welfare conditions of livestock; soil physics and hydrology; micrometeorology; modeling, simulation and forecasting; remote sensing; territorial planning; geographical information systems and spatialization techniques; instrumentation to measure physical and biological quantities; data validation techniques; agroclimatology; scientific dissemination in agriculture; support services for farmers.