Shiyu Liu, Yiannis Ampatzidis, Congliang Zhou, Won Suk Lee
Title: AI-driven time series analysis for predicting strawberry weekly yields integrating fruit monitoring and weather data for optimized harvest planning
Journal: Computers and Electronics in Agriculture, volume 233, Article 110212 (Journal Article)
Published: 2025-03-02
DOI: 10.1016/j.compag.2025.110212
URL: https://www.sciencedirect.com/science/article/pii/S0168169925003187
Impact factor: 7.7 (JCR Q1, Agriculture, Multidisciplinary)
Citations: 0
Abstract
Strawberries, as an indeterminate crop, produce fruit multiple times per season, making fruit monitoring and wave-specific yield prediction essential for optimizing harvest planning. This study developed an AI-driven approach to predict the next week's yield using environmental data and real-time plant image data collected by a machine vision system. YOLOv8n was employed to count flowers, immature fruit, and mature fruit per plant, with manual counts used to evaluate the system's accuracy. The YOLOv8n-based counts, combined with weather features, were used to train several AI models for yield prediction. These models included traditional time-series machine learning approaches, such as Multiple Linear Regression (MLR) with time-lag features, Vector Autoregression (VAR), Gradient Boosting Machines (GBM), and Random Forest, as well as deep learning time-series models, including Long Short-Term Memory (LSTM) and Temporal Convolutional Networks (TCN). Recursive Feature Elimination (RFE) was employed to identify the most relevant features. The performance of these models was evaluated across three strawberry varieties: Sensation, Brilliance, and Medallion. Results showed that MLR outperformed the other models for Sensation and Brilliance, with R² values of 0.633 and 0.908, respectively. For Medallion, GBM achieved the best performance with an R² score of 0.848. LSTM, which outperformed TCN, achieved R² scores of 0.522 (Sensation), 0.839 (Brilliance), and 0.740 (Medallion). This AI-driven system automates yield forecasting, reducing labor costs and enabling more efficient harvest planning. The study highlights the potential of combining machine vision and predictive analytics for precise, scalable yield prediction, offering valuable insights for proactive farm management and supply chain optimization.
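The abstract names MLR with time-lag features and reports R² scores, but it does not give the lag depth, the exact feature set, or the fitting procedure. The following is therefore only a minimal sketch of the general idea, not the authors' implementation. It assumes weekly per-plant counts of flowers, immature fruit, and mature fruit as features, pairs them with the following week's yield, fits ordinary least squares, and scores the fit with R². The function names, the single-week lag, and all numbers are hypothetical; the data are synthetic and noise-free, so they say nothing about the reported results.

```python
from typing import List, Sequence, Tuple


def make_lagged_dataset(
    features: Sequence[Sequence[float]],
    yields: Sequence[float],
    n_lags: int,
) -> Tuple[List[List[float]], List[float]]:
    """Pair the feature rows of weeks t, t-1, ..., t-n_lags+1 with the yield of week t+1."""
    X, y = [], []
    for t in range(n_lags - 1, len(yields) - 1):
        row = [1.0]  # intercept term
        for lag in range(n_lags):
            row.extend(features[t - lag])
        X.append(row)
        y.append(yields[t + 1])
    return X, y


def fit_least_squares(X: List[List[float]], y: List[float]) -> List[float]:
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix; use fewer lags or more weeks")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(X: List[List[float]], coef: List[float]) -> List[float]:
    return [sum(c * v for c, v in zip(coef, row)) for row in X]


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic weekly counts per plant: [flowers, immature fruit, mature fruit].
features = [
    [12, 5, 1], [15, 8, 2], [10, 12, 4], [8, 11, 7],
    [14, 7, 9], [18, 9, 5], [11, 14, 6], [9, 10, 10],
]
# Synthetic yields: next week's yield is an invented linear function of this
# week's counts. The first entry is a placeholder that is never used as a target.
yields = [0.0] + [2 + 0.1 * f + 0.3 * i + 0.8 * m for f, i, m in features[:-1]]

X, y = make_lagged_dataset(features, yields, n_lags=1)
coef = fit_least_squares(X, y)
r2 = r2_score(y, predict(X, coef))
print("coefficients:", [round(c, 3) for c in coef])
print("R^2 on the training weeks:", round(r2, 3))
```

Because the synthetic yields are an exact linear function of the previous week's counts, the fit recovers the invented coefficients almost exactly. With real data the score would be computed on held-out weeks, and a larger `n_lags` would add earlier weeks' counts as extra columns.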
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.