Phenotyping Alfalfa (Medicago sativa L.) Root Structure Architecture via Integrating Confident Machine Learning with ResNet-18.

IF 6.4 · Zone 1 (Agricultural and Forestry Sciences) · Q1 AGRONOMY

Plant Phenomics Pub Date : 2024-09-11 DOI:10.34133/plantphenomics.0251

Brandon J Weihs, Zhou Tang, Zezhong Tian, Deborah Jo Heuschele, Aftab Siddique, Thomas H Terrill, Zhou Zhang, Larry M York, Zhiwu Zhang, Zhanyou Xu

Abstract

Background: Root system architecture (RSA) is of growing interest for plant improvement through belowground root traits. Modern computing technology applied to images offers new pathways to trait improvement and selection through RSA analysis (using images to discern and classify root types and traits). However, a major stumbling block to image-based RSA phenotyping is image label noise, which reduces the accuracy of models that take images as direct inputs. To address the label noise problem, this study used an artificial intelligence model capable of classifying the RSA of alfalfa (Medicago sativa L.) directly from images and coupled it with downstream label improvement methods. Model outputs for the images were compared with manual root classifications, and confident machine learning (CL) and reactive machine learning (RL) methods were tested to minimize the effects of subjective labeling and thereby improve labeling and prediction accuracy.

Results: The CL algorithm modestly improved the Random Forest model's overall prediction accuracy on the Minnesota dataset (1%), while larger gains were observed with the ResNet-18 model. With CL, ResNet-18 cross-population prediction accuracy improved by ~8% to 13% relative to the original/preprocessed datasets. For predicting taproot RSAs, the training and testing data combinations with the highest accuracies (86%) came from the CL- and/or RL-corrected datasets. Similarly, the highest accuracies for the intermediate RSA class came from corrected data combinations. The highest overall accuracy (~75%) with the ResNet-18 model involved CL on a pooled dataset containing images from both sample locations.

Conclusions: ResNet-18 deep neural network (DNN) prediction accuracies for alfalfa RSA image labels increase when CL and RL are employed. By enlarging the dataset to reduce overfitting while concurrently finding and correcting image label errors, this study demonstrates that accuracy gains of as much as ~11% to 13% can be achieved with semi-automated, computer-assisted preprocessing and data cleaning (CL/RL).
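To make the label-cleaning idea concrete, here is a minimal sketch of the thresholding rule commonly associated with confident learning. It is not the authors' pipeline, and the abstract does not specify how their CL step was implemented. The function names, the two-class mapping (0 = taproot, 1 = intermediate), and the toy probabilities are all assumptions for illustration. The sketch assumes out-of-sample predicted probabilities are already available for each image (for example, from cross-validated classifier outputs).

```python
from statistics import mean

# Hypothetical class mapping for illustration only: 0 = taproot, 1 = intermediate.

def class_thresholds(labels, probs, n_classes):
    """Per-class threshold: mean predicted probability of class j,
    taken over the examples whose given (possibly noisy) label is j."""
    thresholds = []
    for j in range(n_classes):
        own = [p[j] for y, p in zip(labels, probs) if y == j]
        # A class with no labeled examples can never be confidently predicted.
        thresholds.append(mean(own) if own else float("inf"))
    return thresholds

def find_label_issues(labels, probs):
    """Return (index, given_label, suggested_label) for each example whose
    confidently predicted class disagrees with its given label."""
    n_classes = len(probs[0])
    t = class_thresholds(labels, probs, n_classes)
    issues = []
    for i, (y, p) in enumerate(zip(labels, probs)):
        # Classes whose predicted probability clears that class's threshold.
        confident = [j for j in range(n_classes) if p[j] >= t[j]]
        if not confident:
            continue  # no confident class: leave the label alone
        best = max(confident, key=lambda j: p[j])
        if best != y:
            issues.append((i, y, best))
    return issues

# Toy data (fabricated for the example, not taken from the study).
labels = [0, 0, 0, 1, 1, 1]
probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.05, 0.95],  # labeled taproot, but the model strongly prefers intermediate
    [0.10, 0.90],
    [0.30, 0.70],
    [0.15, 0.85],
]

print(find_label_issues(labels, probs))  # prints [(2, 0, 1)]
```

Flagged examples would then go to a human reviewer or be relabeled or dropped before retraining, which corresponds to the semi-automated correction step the abstract describes only in general terms.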

Source journal: Plant Phenomics

CiteScore: 8.60

Self-citation rate: 9.20%

Review time: 14 weeks

About the journal: Plant Phenomics is an Open Access journal published in affiliation with the State Key Laboratory of Crop Genetics & Germplasm Enhancement, Nanjing Agricultural University (NAU) and published by the American Association for the Advancement of Science (AAAS). Like all partners participating in the Science Partner Journal program, Plant Phenomics is editorially independent from the Science family of journals. The mission of Plant Phenomics is to publish novel research that will advance all aspects of plant phenotyping, from the cell to the plant population level, using innovative combinations of sensor systems and data analytics. Plant Phenomics also aims to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer science. Plant Phenomics should thus contribute to advancing plant sciences and agriculture, forestry, and horticulture by addressing key scientific challenges in the area of plant phenomics. The scope of the journal covers the latest technologies in plant phenotyping for data acquisition, data management, data interpretation, and modeling, and their practical applications in crop cultivation, plant breeding, forestry, horticulture, ecology, and other plant-related domains.