Huihan Wang , Yuanshuai Dai , Qiushuang Yao , Lulu Ma , Ze Zhang , Xin Lv
Field Crops Research, Volume 333, Article 110070. Published 2025-07-10. DOI: 10.1016/j.fcr.2025.110070. Impact factor 5.6; JCR Q1 (Agronomy). https://www.sciencedirect.com/science/article/pii/S0378429025003351
Multi-task learning model driven by climate and remote sensing data collaboration for mid-season cotton yield prediction
Accurate prediction of cotton yield is critical for agricultural policy, production management, and food security. We aimed to enhance regional-scale cotton yield estimation by clarifying the respective contributions of climate and remote sensing variables and identifying optimal time windows for early prediction. We focused on the 8th Division of the Xinjiang Production and Construction Corps in China, using field survey data, Sentinel-2A imagery, and meteorological records from 2021 and 2023. Key variables were selected using Sequential Forward Selection and Structural Equation Modeling. Partial Least Squares Regression (PLSR), Random Forest, and XGBoost models were developed to estimate cotton yields and assess the performance of different data combinations and time periods. Additionally, a multi-task learning (MTL) framework was proposed to support dynamic early-season yield prediction, with 15-day interval time windows. Results showed that climate factors indirectly influenced yield by affecting vegetation status, while remote sensing data contributed significantly to prediction accuracy, particularly during key growth stages. Climate data alone generally outperformed remote sensing data, although their combination consistently improved model accuracy and stability. PLSR achieved the best performance at the T6 window (flowering and boll-setting stage) with R² = 0.60 and RMSE = 605.7 kg/ha. The MTL model demonstrated increasing accuracy as the season progressed, achieving optimal performance 60 days before harvest (R² = 0.71, RMSE = 519.7 kg/ha). We provide a cost-effective, timely, and simple framework for predicting cotton yields at a regional scale using publicly available data. The findings support improved agricultural production management and contribute to food security initiatives.
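For readers unfamiliar with the two accuracy metrics quoted in the abstract, the following is a minimal sketch of how R² and RMSE are commonly computed from observed and predicted yields. It assumes the usual definition R² = 1 - SS_res / SS_tot; the abstract does not state which definition the authors used. The function names and the yield values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs (e.g. kg/ha)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical plot-level yields in kg/ha (illustrative numbers, not study data).
observed_yield = [5200.0, 6100.0, 4800.0, 6700.0, 5900.0]
predicted_yield = [5400.0, 5900.0, 5100.0, 6300.0, 6000.0]

print(f"R^2  = {r_squared(observed_yield, predicted_yield):.2f}")
print(f"RMSE = {rmse(observed_yield, predicted_yield):.1f} kg/ha")
```

Because RMSE is expressed in the units of the target, a value such as 519.7 kg/ha can be read directly as a typical prediction error per hectare, whereas R² is unitless and describes the share of yield variance the model accounts for.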
About the journal:
Field Crops Research is an international journal publishing scientific articles on experimental and modelling research at field, farm and landscape levels, on temperate and tropical crops and cropping systems, with a focus on crop ecology and physiology, agronomy, and plant genetics and breeding.