Authors: Shan Ke, Guowei Dai, Hui Pan, Bowen Jin
Journal: Egyptian Informatics Journal (JCR Q1, Computer Science, Artificial Intelligence; impact factor 5.0)
Publication type: Journal Article
Publication date: 2024-03-21
DOI: 10.1016/j.eij.2024.100456
URL: https://www.sciencedirect.com/science/article/pii/S1110866524000197
Citations: 0
Abstract
Intelligent vineyard blade density measurement method incorporating a lightweight vision transformer
Under the new demand model of Agriculture 4.0, automated spraying is a complex precision-agriculture task: a computer vision perception system must estimate plant leaf density and trigger the spraying operation accordingly in real time. To determine grape leaf density accurately, this work proposes an image-based method built on a lightweight Vision Transformer (ViT) architecture. The method includes a fusion data augmentation scheme with two parts. The first, a dual augmentation spatial extension, applies pixel augmentation and spatial augmentation to the original images. The second, weather data augmentation, draws on practical experience to adapt the data to agricultural operating environments. Fusing the two expands the sample of grape leaf density images, which in turn improves the model's generalization ability and robustness. The lightweight ViT uses self-attention to extract high-frequency local feature representations automatically and efficiently, and a two-branch structure mixes high-frequency and low-frequency information to form grapevine leaf density features in the region of interest. The feature extraction layers are analyzed with t-SNE and histogram methods, which makes the model more transparent in terms of both the multidimensional feature space and the frequency-domain distribution. The experimental results show that the fusion data augmentation method improves recognition accuracy: compared with the individual augmentation methods it comprises, accuracy improves by 0.55 % and 3.46 %, respectively. Recognition accuracy exceeds 94 % for all four grape leaf density classes, and the Matthews correlation coefficient (MCC) reaches 90.39 %. Compared with the popular MobileViT, the proposed lightweight ViT improves accuracy by at least 0.34 % while requiring only 0.6 G FLOPs.
The proposed method offers high recognition speed and accuracy. It can provide practical technical support for plant protection spraying robots and, by reducing pesticide residues, improve growers' profitability.
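The abstract reports an MCC alongside accuracy for a four-class problem. As a minimal sketch of how such a value can be computed, the snippet below implements the standard multiclass generalization of the Matthews correlation coefficient from a confusion matrix. The abstract does not say which MCC formulation the authors used, so this choice is an assumption, and the confusion matrix holds made-up illustrative counts, not the paper's results.

```python
from math import sqrt


def multiclass_mcc(confusion):
    """Multiclass Matthews correlation coefficient for a K x K confusion matrix.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    k = len(confusion)
    s = sum(sum(row) for row in confusion)        # total number of samples
    c = sum(confusion[i][i] for i in range(k))    # correctly classified samples
    t = [sum(confusion[i]) for i in range(k)]     # true count per class (row sums)
    p = [sum(confusion[i][j] for i in range(k))   # predicted count per class (column sums)
         for j in range(k)]
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = sqrt((s * s - sum(pk * pk for pk in p))
                       * (s * s - sum(tk * tk for tk in t)))
    # Convention: return 0.0 when the denominator vanishes (e.g. a single predicted class).
    return numerator / denominator if denominator else 0.0


# Hypothetical 4-class confusion matrix (illustrative counts only).
example = [
    [48, 2, 0, 0],
    [3, 45, 2, 0],
    [0, 2, 46, 2],
    [0, 0, 1, 49],
]
accuracy = sum(example[i][i] for i in range(4)) / sum(map(sum, example))
print(f"accuracy = {accuracy:.4f}, MCC = {multiclass_mcc(example):.4f}")
```

Unlike accuracy, MCC uses the full confusion matrix, so it drops when errors concentrate in particular classes. That makes it a useful complement to accuracy when the density classes are not perfectly balanced.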
About the journal:
The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The Journal provides a forum for state-of-the-art research and development in the fields of computing, including computer science, information technology, information systems, operations research and decision support. It encourages submissions of innovative, previously unpublished work on the subjects it covers, whether from academic, research or commercial sources.