{"title":"Deep learning for broadleaf weed seedlings classification incorporating data variability and model flexibility across two contrasting environments","authors":"Lorenzo León , Cristóbal Campos , Juan Hirzel","doi":"10.1016/j.aiia.2024.03.002","DOIUrl":null,"url":null,"abstract":"<div><p>The increasing deployment of deep learning models for distinguishing weeds and crops has witnessed notable strides in agricultural scenarios. However, a conspicuous gap endures in the literature concerning the training and testing of models across disparate environmental conditions. Predominant methodologies either delineate a single dataset distribution into training, validation, and testing subsets or merge datasets from diverse conditions or distributions before their division into the subsets. Our study aims to ameliorate this gap by extending to several broadleaf weed categories across varied distributions, evaluating the impact of training convolutional neural networks on datasets specific to particular conditions or distributions, and assessing their performance in entirely distinct settings through three experiments. By evaluating diverse network architectures and training approaches (<em>finetuning</em> versus <em>feature extraction</em>), testing various architectures, employing different training strategies, and amalgamating data, we devised straightforward guidelines to ensure the model's deployability in contrasting environments with sustained precision and accuracy.</p><p>In Experiment 1, conducted in a uniform environment, accuracy ranged from 80% to 100% across all models and training strategies, with <em>finetune</em> mode achieving a superior performance of 94% to 99.9% compared to the <em>feature extraction</em> mode at 80% to 92.96%. Experiment 2 underscored a significant performance decline, with accuracy figures between 25% and 60%, primarily at 40%, when the origin of the test data deviated from the train and validation sets. Experiment 3, spotlighting dataset and distribution amalgamation, yielded promising accuracy metrics, notably a peak of 99.6% for ResNet in <em>finetuning</em> mode to a low of 69.9% for InceptionV3 in <em>feature extraction</em> mode. These pivotal findings emphasize that merging data from diverse distributions, coupled with <em>finetuned</em> training on advanced architectures like ResNet and MobileNet, markedly enhances performance, contrasting with the relatively lower performance exhibited by simpler networks like AlexNet. Our results suggest that embracing data diversity and flexible training methodologies are crucial for optimizing weed classification models when disparate data distributions are available. 
This study gives a practical alternative for treating diverse datasets with real-world agricultural variances.</p></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":null,"pages":null},"PeriodicalIF":8.2000,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2589721724000059/pdfft?md5=d8051b8dea55cec53a6ba7889cbc0c03&pid=1-s2.0-S2589721724000059-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence in Agriculture","FirstCategoryId":"1087","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2589721724000059","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AGRICULTURE, MULTIDISCIPLINARY","Score":null,"Total":0}
Abstract
The deployment of deep learning models for distinguishing weeds from crops has made notable strides in agricultural settings. However, a conspicuous gap remains in the literature concerning the training and testing of models across disparate environmental conditions. Prevailing methodologies either split a single dataset distribution into training, validation, and testing subsets, or merge datasets from diverse conditions or distributions before dividing them into those subsets. Our study addresses this gap by covering several broadleaf weed categories across varied distributions, evaluating the impact of training convolutional neural networks on datasets specific to particular conditions or distributions, and assessing their performance in entirely distinct settings through three experiments. By evaluating diverse network architectures and training approaches (finetuning versus feature extraction) and by amalgamating data from different distributions, we derived straightforward guidelines for deploying models in contrasting environments with sustained precision and accuracy.
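The two training strategies contrasted above differ only in whether the pretrained backbone is updated. The following is a minimal PyTorch sketch of that distinction, not the authors' implementation: the ResNet-18 backbone, the assumed six weed classes, and the optimizer settings are illustrative assumptions.

```python
# Minimal sketch of finetuning vs. feature extraction on a pretrained CNN.
# Backbone, class count, and hyperparameters are assumptions, not the
# authors' exact configuration.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6  # assumed number of broadleaf weed categories


def build_model(mode: str) -> nn.Module:
    """Return a pretrained ResNet-18 configured for either strategy."""
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    if mode == "feature_extraction":
        # Freeze the convolutional backbone; only the new head is trained.
        for param in model.parameters():
            param.requires_grad = False
    # In "finetuning" mode all layers remain trainable.
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new classifier head
    return model


model = build_model("finetuning")
# Optimize only the parameters that still require gradients.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```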
In Experiment 1, conducted in a uniform environment, accuracy ranged from 80% to 100% across all models and training strategies; finetuning achieved superior performance (94% to 99.9%) compared with feature extraction (80% to 92.96%). Experiment 2 revealed a significant performance decline when the test data originated from a different environment than the training and validation sets, with accuracy between 25% and 60%, clustering around 40%. Experiment 3, which amalgamated datasets and distributions, yielded promising accuracy, ranging from a peak of 99.6% for ResNet in finetuning mode to a low of 69.9% for InceptionV3 in feature extraction mode. These findings show that merging data from diverse distributions, coupled with finetuned training on modern architectures such as ResNet and MobileNet, markedly enhances performance, in contrast to the lower performance of simpler networks such as AlexNet. Our results suggest that embracing data diversity and flexible training methodologies is crucial for optimizing weed classification models when disparate data distributions are available. This study offers a practical approach for handling diverse datasets with real-world agricultural variability.
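The contrast between Experiments 2 and 3 comes down to how data from the two environments is handled: keep the distributions disjoint (train on one, test on the other) or merge them before splitting. Below is a hedged sketch of that choice using torchvision datasets; the directory names, transforms, and 70/15/15 split ratios are assumptions for illustration only.

```python
# Sketch of the dataset-handling choice behind Experiments 2 and 3:
# disjoint distributions vs. amalgamation before splitting.
# Paths, transforms, and split ratios are illustrative assumptions.
import torch
from torchvision import datasets, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

env_a = datasets.ImageFolder("data/environment_a", transform=tfm)
env_b = datasets.ImageFolder("data/environment_b", transform=tfm)

# Experiment-2-style setup: train/validate on one environment, test on the other.
train_separate, test_separate = env_a, env_b

# Experiment-3-style setup: merge both distributions, then split randomly.
merged = torch.utils.data.ConcatDataset([env_a, env_b])
n_train = int(0.7 * len(merged))
n_val = int(0.15 * len(merged))
n_test = len(merged) - n_train - n_val
train_set, val_set, test_set = torch.utils.data.random_split(
    merged, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),
)
```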