Informed-Learning-Guided Visual Question Answering Model of Crop Disease.

IF 6.4, Zone 1 (Agricultural and Forestry Sciences), JCR Q1 (AGRONOMY)

Plant Phenomics. Pub Date: 2024-12-16. eCollection Date: 2024-01-01. DOI: 10.34133/plantphenomics.0277

Yunpeng Zhao, Shansong Wang, Qingtian Zeng, Weijian Ni, Hua Duan, Nengfu Xie, Fengjin Xiao

Plant Phenomics, volume 6, article 0277. Publication type: Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11649200/pdf/


Abstract

In contemporary agriculture, experts develop preventative and remedial strategies for various disease stages in diverse crops. Decision-making about the stage of disease occurrence exceeds the capabilities of single-image tasks such as image classification and object detection. Consequently, research now focuses on training visual question answering (VQA) models. However, existing studies concentrate on identifying disease species rather than formulating questions that cover multiple crucial attributes. Additionally, model performance is sensitive to model structure and dataset biases. To address these challenges, we construct the informed-learning-guided VQA model of crop disease (ILCD). ILCD improves model performance by integrating coattention, a multimodal fusion model (MUTAN), and a bias-balancing (BiBa) strategy. To facilitate the investigation of various visual attributes of crop diseases and the determination of disease occurrence stages, we construct a new VQA dataset called the Crop Disease Multi-attribute VQA with Prior Knowledge (CDwPK-VQA). This dataset contains comprehensive information on visual attributes such as shape, size, status, and color. We expand the dataset by integrating prior knowledge into CDwPK-VQA to address performance challenges. In comparative experiments on the VQA-v2, VQA-CP v2, and CDwPK-VQA datasets, ILCD achieves accuracies of 68.90%, 49.75%, and 86.06%, respectively. Ablation experiments on CDwPK-VQA evaluate the effectiveness of the individual modules, including coattention, MUTAN, and BiBa. These experiments demonstrate that ILCD exhibits the highest level of accuracy, performance, and value in the field of agriculture. The source code can be accessed at https://github.com/SdustZYP/ILCD-master/tree/main.
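As a rough illustration of the kind of multimodal fusion the abstract names, the sketch below shows a rank-constrained bilinear (Tucker-style) fusion of a question vector and an image vector, in the spirit of MUTAN. It is not the authors' implementation (their code is in the repository linked above). The class name `MutanFusionSketch`, all dimensions, the `tanh` nonlinearity, and the random untrained weights are assumptions made purely for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MutanFusionSketch:
    """Hypothetical rank-constrained bilinear fusion of question and image features."""

    def __init__(self, dim_q, dim_v, dim_hidden, dim_out, rank, seed=0):
        rng = random.Random(seed)
        # Projections of each modality into a shared low-dimensional space.
        self.W_q = rand_matrix(dim_hidden, dim_q, rng)
        self.W_v = rand_matrix(dim_hidden, dim_v, rng)
        # One pair of factor matrices per rank term of the core tensor.
        self.M = [rand_matrix(dim_out, dim_hidden, rng) for _ in range(rank)]
        self.N = [rand_matrix(dim_out, dim_hidden, rng) for _ in range(rank)]

    def fuse(self, q, v):
        """Return z = sum_r (M_r q_h) * (N_r v_h), with * taken elementwise."""
        q_h = [math.tanh(a) for a in matvec(self.W_q, q)]
        v_h = [math.tanh(a) for a in matvec(self.W_v, v)]
        z = [0.0] * len(self.M[0])
        for M_r, N_r in zip(self.M, self.N):
            a = matvec(M_r, q_h)
            b = matvec(N_r, v_h)
            z = [zi + ai * bi for zi, ai, bi in zip(z, a, b)]
        return z


# Toy usage with arbitrary sizes: a 4-d question vector and a 6-d image vector.
fusion = MutanFusionSketch(dim_q=4, dim_v=6, dim_hidden=3, dim_out=5, rank=2)
z = fusion.fuse([0.1] * 4, [0.2] * 6)
```

In a full VQA model, a fused vector like `z` would typically feed a classifier over candidate answers; the coattention and bias-balancing components mentioned in the abstract are not modeled here.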


Source journal

Plant Phenomics Multiple-

CiteScore: 8.60

Self-citation rate: 9.20%

Articles published:

Review time: 14 weeks

About the journal: Plant Phenomics is an Open Access journal published in affiliation with the State Key Laboratory of Crop Genetics & Germplasm Enhancement, Nanjing Agricultural University (NAU) and published by the American Association for the Advancement of Science (AAAS). Like all partners participating in the Science Partner Journal program, Plant Phenomics is editorially independent from the Science family of journals. The mission of Plant Phenomics is to publish novel research that will advance all aspects of plant phenotyping, from the cell to the plant population level, using innovative combinations of sensor systems and data analytics. Plant Phenomics also aims to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer science. Plant Phenomics should thus contribute to advancing plant sciences and agriculture, forestry, and horticulture by addressing key scientific challenges in the area of plant phenomics. The scope of the journal covers the latest technologies in plant phenotyping for data acquisition, data management, data interpretation, and modeling, and their practical applications in crop cultivation, plant breeding, forestry, horticulture, ecology, and other plant-related domains.