Kodjo Abel Odah, Sèton Calmette Ariane Houetohossou, Vinasetan Ratheil Houndji, Romain Lucas Glèlè Kakaï
Machine learning techniques for tomato yield prediction: A comprehensive analysis
Smart Agricultural Technology, volume 12, Article 101067. Published 2025-06-06. Journal Article.
DOI: 10.1016/j.atech.2025.101067
URL: https://www.sciencedirect.com/science/article/pii/S2772375525003004
Impact factor: 5.7. JCR: Q1 (Agricultural Engineering).
Citation count: 0
Abstract
Effective yield prediction is crucial for farmers and the agricultural sector. It gives producers greater control over their operations and helps them align with market supply and demand. With the emergence of Artificial Intelligence (AI), various Machine Learning (ML) models have been developed to predict crop yield. In this study, we conducted a systematic literature review to examine the ML models used for predicting tomato yield, the features associated with the most effective models, and the challenges faced by users. We retrieved 1,486 scientific papers from six electronic databases. Following the PRISMA guidelines, we included 57 studies in our analysis. The results showed that 66.67% of the models achieving the best performance in predicting or estimating tomato yield are Deep Learning (DL) models, with neural networks accounting for 42.11% of these. Specifically, Long Short-Term Memory (LSTM), Artificial Neural Networks (ANN), and Support Vector Regression (SVR) are the most commonly used models, and they perform strongly when given features such as climate, soil conditions, plant growth, fertilization, and irrigation. When vegetation indices computed from image data are used, Random Forest Regression (RFR) is frequently applied with notable success. The YOLO-Tomato and R-CNN methods are commonly used for detecting tomato fruits prior to yield estimation, and DeepSort and linear regression are the predominant methods for counting fruits and estimating tomato yield. Future research should conduct a comparative analysis of models such as LSTM, ANN, SVR, and RFR specifically for predicting tomato yield using data from Africa.
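The abstract names linear regression as one of the predominant methods for estimating tomato yield once fruits have been detected and counted. The following is a minimal sketch of that idea, not the method of any reviewed study: it fits an ordinary least squares line from fruit counts to yield. The function names `fit_line` and `predict_yield`, the unit (kg), and the four data points are all hypothetical; the values are synthetic placeholders chosen to lie exactly on a line.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_yield(fruit_count: float, slope: float, intercept: float) -> float:
    """Estimate yield from a fruit count using the fitted line."""
    return slope * fruit_count + intercept


# Synthetic placeholder values (assumption), NOT data from the reviewed studies.
fruit_counts = [10.0, 20.0, 30.0, 40.0]  # fruits counted per plot (hypothetical)
yields_kg = [1.5, 2.5, 3.5, 4.5]         # measured yield per plot in kg (hypothetical)

slope, intercept = fit_line(fruit_counts, yields_kg)
estimate = predict_yield(25.0, slope, intercept)
print(f"slope={slope:.3f} kg/fruit, intercept={intercept:.3f} kg, estimate={estimate:.3f} kg")
```

In practice the counts would come from a detection and tracking pipeline such as those the abstract mentions, and the fitted line would need validation on held-out plots before its estimates could be trusted.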