LVF: A language and vision fusion framework for tomato diseases segmentation
Yang Hu, Jiale Zhu, Guoxiong Zhou, Mingfang He, Mingjie Lv, Junhui Wang, Aibin Chen, Jinsheng Deng, Yichu Jiang
Journal: Computers and Electronics in Agriculture (Impact Factor 7.7; JCR Q1, Agriculture, Multidisciplinary)
Publication date: 2024-10-09
DOI: 10.1016/j.compag.2024.109484
URL: https://www.sciencedirect.com/science/article/pii/S0168169924008755
Citation count: 0
Abstract
With the development of deep learning technology, the control of tomato diseases has emerged as a crucial aspect of intelligent agricultural management. While current research on tomato disease segmentation has made considerable strides, challenges persist because images of tomato leaf diseases are susceptible to strong light reflections and shadow gradients in sunlight. Additionally, the complex backgrounds found in agricultural fields often confuse models, resulting in inaccurate segmentation. Traditional methods for tomato disease segmentation rely on single-modal image-based models, which struggle when dealing with the nuanced features and limited scope of tomato leaf diseases. To address these issues, our study introduces the LVF framework, a dual-modal approach combining image and text information for pre-segmentation of tomato diseases. We began by creating a new dataset labeled with both images and text, specifically focusing on diseased tomato leaves, with guidance from agricultural experts. For image processing, we developed a probabilistic differential fusion network that leverages color and grayscale images to mitigate interference caused by high-frequency noise. Furthermore, our reinforcement feature network and threshold filtering network enhance useful information while filtering out negative information from the fused images. In text processing, we proposed a multi-scale cross-nesting network to integrate semantic information about diseases across different scales and types. By nesting BERT-processed word vectors with fused image vectors, our model gains a deeper understanding of semantic information, thereby improving its ability to segment crop diseases accurately. Our experiments, conducted on self-constructed tomato datasets as well as public datasets for tomatoes and maize, demonstrated the efficacy and robustness of our approach in leaf disease segmentation. The LVF framework offers a valuable tool to enhance the accuracy of crop disease segmentation, especially in complex agricultural environments.
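The general idea of letting a text vector guide a per-pixel decision can be illustrated with a toy sketch. This is not the LVF architecture: the abstract does not specify how the networks are built, so the cosine-similarity gate, the fixed threshold, the function names (`cosine`, `text_guided_mask`), and the hand-written feature vectors below are all assumptions made purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def text_guided_mask(pixel_features, text_vector, threshold=0.5):
    """Label a pixel 1 when its feature vector aligns with the text vector, else 0.

    Hypothetical stand-in for text-conditioned segmentation; the threshold
    value is an arbitrary assumption, not taken from the paper.
    """
    return [
        [1 if cosine(feat, text_vector) >= threshold else 0 for feat in row]
        for row in pixel_features
    ]

# Toy 2x2 "image": each pixel carries a made-up 3-dimensional feature vector.
pixel_features = [
    [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]],
    [[0.8, 0.2, 0.1], [0.1, 0.1, 0.9]],
]
# Made-up text embedding, standing in for a BERT-derived description vector.
text_vector = [1.0, 0.0, 0.0]

mask = text_guided_mask(pixel_features, text_vector)
print(mask)
```

In a real dual-modal model the pixel features and the text embedding would come from learned encoders and the fusion would itself be learned; the sketch only shows why a textual description can help separate lesion pixels from a cluttered background.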
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.