Informed-Learning-Guided Visual Question Answering Model of Crop Disease.

IF 7.6 · Zone 1 (Agricultural and Forestry Sciences) · JCR Q1 (Agronomy)
Plant Phenomics. Pub Date: 2024-12-16. eCollection Date: 2024-01-01. DOI: 10.34133/plantphenomics.0277
Yunpeng Zhao, Shansong Wang, Qingtian Zeng, Weijian Ni, Hua Duan, Nengfu Xie, Fengjin Xiao
Journal article. Plant Phenomics, volume 6, article 0277. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11649200/pdf/

Abstract

In contemporary agriculture, experts develop preventative and remedial strategies for various disease stages in diverse crops. Decision-making regarding the stages of disease occurrence exceeds the capabilities of single-image tasks, such as image classification and object detection. Consequently, research now focuses on training visual question answering (VQA) models. However, existing studies concentrate on identifying disease species rather than formulating questions that cover multiple crucial attributes. Additionally, model performance is sensitive to model structure and dataset biases. To address these challenges, we construct the informed-learning-guided VQA model of crop disease (ILCD). ILCD improves model performance by integrating coattention, a multimodal fusion model (MUTAN), and a bias-balancing (BiBa) strategy. To facilitate the investigation of various visual attributes of crop diseases and the determination of disease occurrence stages, we construct a new VQA dataset called the Crop Disease Multi-attribute VQA with Prior Knowledge (CDwPK-VQA). This dataset contains comprehensive information on visual attributes such as shape, size, status, and color. We expand the dataset by integrating prior knowledge into CDwPK-VQA to address performance challenges. In comparative experiments on the VQA-v2, VQA-CP v2, and CDwPK-VQA datasets, ILCD achieves accuracies of 68.90%, 49.75%, and 86.06%, respectively. Ablation experiments on CDwPK-VQA evaluate the effectiveness of the individual modules, including coattention, MUTAN, and BiBa. These experiments demonstrate that ILCD exhibits the highest level of accuracy, performance, and value in the field of agriculture. The source code can be accessed at https://github.com/SdustZYP/ILCD-master/tree/main.
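
The abstract names MUTAN as the multimodal fusion component but does not describe it. As a rough illustration only, the sketch below shows the general idea behind Tucker-style bilinear fusion, on which MUTAN is based: a question vector and an image vector are each projected to a small space, combined through a three-way core tensor, and mapped to answer scores. This is not the ILCD authors' code. All names, dimensions, and random weights here (`tucker_fuse`, `project`, `D_Q`, `T_Q`, and so on) are hypothetical, and the published MUTAN design adds further structure (such as a low-rank constraint on the core tensor) that this sketch omits.

```python
import random


def project(x, W):
    """Return x^T W for a vector x (length n) and a matrix W (n rows, m columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def tucker_fuse(q, v, W_q, W_v, core, W_o):
    """Fuse a question vector q and an image vector v with a Tucker-style core.

    core[i][j][k] couples projected question dim i, projected image dim j,
    and fused output dim k. No nonlinearity is applied (an assumption made
    to keep the sketch purely bilinear).
    """
    q_t = project(q, W_q)  # question projected to T_Q dims
    v_t = project(v, W_v)  # image projected to T_V dims
    t_o = len(core[0][0])
    # z[k] = sum over i, j of q_t[i] * v_t[j] * core[i][j][k]
    z = [
        sum(q_t[i] * v_t[j] * core[i][j][k]
            for i in range(len(q_t))
            for j in range(len(v_t)))
        for k in range(t_o)
    ]
    return project(z, W_o)  # one score per candidate answer


def random_matrix(rng, rows, cols):
    """Build a rows x cols matrix of uniform random weights (stand-in for learned weights)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy dimensions; real models use far larger ones.
D_Q, D_V, T_Q, T_V, T_O, N_ANSWERS = 6, 8, 3, 4, 5, 7

rng = random.Random(0)
q = [rng.uniform(-1.0, 1.0) for _ in range(D_Q)]  # stand-in question embedding
v = [rng.uniform(-1.0, 1.0) for _ in range(D_V)]  # stand-in image embedding
W_q = random_matrix(rng, D_Q, T_Q)
W_v = random_matrix(rng, D_V, T_V)
core = [random_matrix(rng, T_V, T_O) for _ in range(T_Q)]  # shape T_Q x T_V x T_O
W_o = random_matrix(rng, T_O, N_ANSWERS)

scores = tucker_fuse(q, v, W_q, W_v, core, W_o)
print(len(scores), "answer scores")
```

The point of factoring the interaction this way is parameter count: a full bilinear map between the two inputs and the answer space would need D_Q x D_V x N_ANSWERS weights, whereas the factored form needs only the three projection matrices plus the small core.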

Source journal: Plant Phenomics
CiteScore: 8.60
Self-citation rate: 9.20%
Articles published: 26
Review time: 14 weeks
About the journal: Plant Phenomics is an Open Access journal published in affiliation with the State Key Laboratory of Crop Genetics & Germplasm Enhancement, Nanjing Agricultural University (NAU) and published by the American Association for the Advancement of Science (AAAS). Like all partners participating in the Science Partner Journal program, Plant Phenomics is editorially independent from the Science family of journals. The mission of Plant Phenomics is to publish novel research that will advance all aspects of plant phenotyping from the cell to the plant population levels using innovative combinations of sensor systems and data analytics. Plant Phenomics also aims to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer sciences. Plant Phenomics should thus contribute to advancing plant sciences and agriculture/forestry/horticulture by addressing key scientific challenges in the area of plant phenomics. The scope of the journal covers the latest technologies in plant phenotyping for data acquisition, data management, data interpretation, modeling, and their practical applications for crop cultivation, plant breeding, forestry, horticulture, ecology, and other plant-related domains.