High-Accuracy Tomato Leaf Disease Image-Text Retrieval Method Utilizing LAFANet

Plants. Pub Date: 2024-04-23. DOI: 10.3390/plants13091176
Jiaxin Xu, Hongliang Zhou, Yufan Hu, Yongfei Xue, Guoxiong Zhou, Liujun Li, Weisi Dai, Jinyang Li

Abstract

Tomato leaf disease control in smart agriculture urgently requires attention and reinforcement. This paper proposes LAFANet, an image-text retrieval method that integrates image and text information for joint analysis of multimodal data, giving agricultural practitioners more comprehensive and in-depth diagnostic evidence to help ensure the quality and yield of tomatoes. First, we focus on six common tomato leaf diseases, collecting images and text descriptions to create the Tomato Leaf Disease Image-Text Retrieval Dataset (TLDITRD), which introduces image-text retrieval into the field of tomato leaf disease retrieval. Then, using ViT and BERT models, we extract detailed image features and sequences of textual features, incorporating contextual information from image-text pairs. To address retrieval errors caused by complex backgrounds, we propose Learnable Fusion Attention (LFA) to strengthen the fusion of textual and image features, thereby extracting rich semantic information from both modalities. To further explore the semantic connections across modalities, we propose a False Negative Elimination-Adversarial Negative Selection (FNE-ANS) approach, which aims to identify adversarial negative instances that specifically target false negatives within the triplet function, thereby imposing constraints on the model. To improve the model's generalization and precision, we propose Adversarial Regularization (AR), which incorporates adversarial perturbations during model training, thereby strengthening its robustness and adaptability to slight variations in input data. Experimental results show that LAFANet outperformed existing state-of-the-art models on the TLDITRD dataset, with top1, top5, and top10 reaching 83.3% and 90.0%, and top1, top5, and top10 reaching 80.3%, 93.7%, and 96.3%. LAFANet offers new technical support and algorithmic insights for the retrieval of tomato leaf disease through image-text correlation.
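
The abstract reports top1/top5/top10 scores and refers to a triplet function with negative selection, without giving formulas. The following is a minimal sketch of how such quantities are commonly computed from an image-text similarity matrix, not the authors' implementation. The function names, the `margin` and `false_negative_threshold` parameters, the rule for skipping suspected false negatives, and the toy values are all illustrative assumptions; in particular, the threshold rule is a stand-in and is not the FNE-ANS procedure.

```python
from typing import List, Sequence


def top_k_accuracy(sim: Sequence[Sequence[float]], k: int) -> float:
    """Fraction of queries whose matching item ranks in the top k.

    sim[i][j] is the similarity between query i and candidate j;
    the correct candidate for query i is assumed to be index i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank candidate indices by descending similarity.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(sim)


def triplet_loss_hardest_negative(
    sim: Sequence[Sequence[float]],
    margin: float = 0.2,
    false_negative_threshold: float = 0.95,
) -> float:
    """Mean hinge triplet loss using the hardest in-batch negative.

    Negatives whose similarity exceeds `false_negative_threshold` are
    skipped as suspected false negatives (a hypothetical rule, not the
    paper's FNE-ANS procedure).
    """
    losses: List[float] = []
    for i, row in enumerate(sim):
        positive = row[i]
        negatives = [
            s for j, s in enumerate(row)
            if j != i and s <= false_negative_threshold
        ]
        if not negatives:
            # No usable negative for this query; it contributes no loss term.
            continue
        hardest = max(negatives)
        losses.append(max(0.0, margin + hardest - positive))
    return sum(losses) / len(losses) if losses else 0.0


# Toy 3x3 image-to-text similarity matrix (made-up values).
toy_sim = [
    [0.9, 0.3, 0.1],
    [0.6, 0.5, 0.2],
    [0.2, 0.97, 0.4],
]
print(top_k_accuracy(toy_sim, 1))
print(top_k_accuracy(toy_sim, 2))
print(triplet_loss_hardest_negative(toy_sim))
```

Transposing the similarity matrix gives the other retrieval direction (text-to-image instead of image-to-text), which is one common reason a retrieval paper reports two sets of top-k scores.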