Assessing nutritional pigment content of green and red leafy vegetables by image analysis: Catching the "red herring" of plant digital color processing via machine learning.
Avinash Agarwal, Filipe de Jesus Colwell, Viviana Andrea Correa Galvis, Tom R Hill, Neil Boonham, Ankush Prashar
Biology Methods and Protocols 10(1): bpaf027. Published 2025-04-09. DOI: 10.1093/biomethods/bpaf027 (https://doi.org/10.1093/biomethods/bpaf027). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12057810/pdf/
Abstract
Estimating pigment content of leafy vegetables via digital image analysis is a reliable method for high-throughput assessment of their nutritional value. However, current leaf color analysis models, which were developed using green-leaved plants, fail to perform reliably when analyzing images of anthocyanin (Anth)-rich red-leaved varieties because of misleading or "red herring" trends. Hence, the present study explores the potential for machine learning (ML)-based estimation of nutritional pigment content for green and red leafy vegetables simultaneously using digital color features. For this, images of n = 320 samples from six types of leafy vegetables with varying pigment profiles were acquired using a smartphone camera, followed by extract-based estimation of chlorophyll (Chl), carotenoid (Car), and Anth. Subsequently, three ML methods, namely Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), and Random Forest Regression (RFR), were tested for predicting pigment contents using RGB (Red, Green, Blue), HSV (Hue, Saturation, Value), and L*a*b* (Lightness, Redness-greenness, Yellowness-blueness) datasets individually and in combination. Chl and Car contents were predicted most accurately using the combined colorimetric dataset via SVR (R² = 0.738) and RFR (R² = 0.573), respectively. Conversely, Anth content was predicted most accurately using SVR with HSV data (R² = 0.818). While Chl and Car could be predicted reliably for both green-leaved and Anth-rich samples, Anth could be estimated accurately only for Anth-rich samples because Chl masks Anth in green-leaved samples. Thus, the present findings demonstrate the scope of implementing ML-based leaf color analysis for assessing the nutritional pigment content of red and green leafy vegetables in tandem.
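The abstract refers to RGB, HSV, and L*a*b* datasets used individually and in combination. As a rough illustration of what such a combined colorimetric feature vector could look like, the following sketch converts one mean leaf color into all nine features. It is not the authors' pipeline: the function names (`rgb_to_lab`, `color_features`), the use of a single mean RGB value per leaf, and the assumption of sRGB input with a D65 white point are all illustrative choices that the abstract does not specify.

```python
import colorsys

# Assumption: input is 8-bit sRGB and the L*a*b* reference white is D65.
_XN, _YN, _ZN = 0.95047, 1.0, 1.08883


def _srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel scaled to [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function from the CIE L*a*b* definition."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    return t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(r, g, b):
    """Convert 8-bit sRGB values to CIE L*a*b* via linear RGB and XYZ."""
    rl, gl, bl = (_srgb_to_linear(v / 255.0) for v in (r, g, b))
    # Standard sRGB-to-XYZ matrix for a D65 white point.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = _lab_f(x / _XN), _lab_f(y / _YN), _lab_f(z / _ZN)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def color_features(r, g, b):
    """Return a combined RGB + HSV + L*a*b* feature dict for one color.

    Hypothetical helper: r, g, b are assumed to be the mean 8-bit
    channel values of a segmented leaf region.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    l_star, a_star, b_star = rgb_to_lab(r, g, b)
    return {
        "R": r, "G": g, "B": b,
        "H": h, "S": s, "V": v,
        "L*": l_star, "a*": a_star, "b*": b_star,
    }


if __name__ == "__main__":
    # Example: a reddish leaf color versus a green one (made-up values).
    for name, rgb in {"red leaf": (120, 30, 50), "green leaf": (60, 140, 50)}.items():
        print(name, color_features(*rgb))
```

Feature vectors of this kind, one per sample, could then be passed to a regression model such as PLSR, SVR, or RFR, with the extract-based pigment measurements as targets. In this sketch, a positive a* indicates a redder color and a negative a* a greener one, which is one reason the L*a*b* and HSV representations can separate red and green leaves more directly than raw RGB.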