Physics-informed neural networks for enhanced reference evapotranspiration estimation in Morocco: Balancing semi-physical models and deep learning
Chouaib El Hachimi, Salwa Belaqziz, Saïd Khabba, Andre Daccache, Bouchra Ait Hssaine, Hasan Karjoun, Youness Ouassanouan, Badreddine Sebbar, Mohamed Hakim Kharrou, Salah Er-Raki, Abdelghani Chehbouni
Chemosphere, volume 374, Article 144238. Published 2025-02-20. Impact factor 8.1 (JCR Q1, Environmental Sciences).
DOI: 10.1016/j.chemosphere.2025.144238
https://www.sciencedirect.com/science/article/pii/S0045653525001808
Abstract
Reference evapotranspiration (ETo) is essential for agricultural water management, crop productivity, and irrigation systems. The Penman-Monteith (PM) equation is the standard method for estimating ETo, but its data-intensive nature makes it impractical, especially where the cost of a full standardized weather station is prohibitive, maintenance is inadequate, or data quality and continuity are compromised. To overcome these limitations, various semi-physical (SP) and empirical models that require fewer weather parameters have been developed. In this context, artificial intelligence methods for ETo estimation are gaining attention because they balance simplicity, minimal data requirements, and high accuracy. However, their data-driven nature raises concerns regarding explainability, trustworthiness, adherence to bio-physical laws, and reliability in operational settings. To address these concerns, this paper, inspired by the emerging field of Physics-Informed Neural Networks (PINNs), evaluates the integration of SP models into the loss function during the learning process. The new residual loss combines two losses (the data-driven loss and the loss from the SP model) as a convex combination weighted by a parameter θ. In-situ agrometeorological data were collected at four automatic weather stations in the Tensift Watershed in Morocco, including air temperature (Ta), solar radiation (Rs), relative humidity (RH), and wind speed (Ws). The study integrates the Priestley-Taylor (PT), Makkink (MK), Hargreaves-Samani (HS), and Abtew (AB) models under four data availability scenarios: (1) Ta, Rs and RH; (2) Ta and Rs; (3) only Ta; and (4) only Rs. The investigation begins with quality control of the data and a study of the driving factors of ETo. Next, the SP models were calibrated using the CMA-ES optimization algorithm.
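The θ-weighted loss described above can be illustrated with a minimal Python sketch. The abstract does not give the loss formula, so the mean-squared-error form of both terms and the function and argument names (`combined_loss`, `eto_pred`, `eto_obs`, `eto_sp`) are assumptions made for illustration. The weighting direction follows the abstract's statement that θ = 0 is the fully data-driven model and θ = 1 is the baseline.

```python
import numpy as np


def combined_loss(eto_pred, eto_obs, eto_sp, theta):
    """Convex combination of a data-driven loss and a semi-physical (SP) loss.

    eto_pred: network predictions of ETo
    eto_obs:  reference ETo values used as training targets
    eto_sp:   ETo estimates from a calibrated SP model (e.g. PT, MK, HS or AB)
    theta:    weight in [0, 1]; 0 keeps only the data term, 1 only the SP term

    Both terms are mean squared errors, which is an assumption of this sketch.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    eto_pred = np.asarray(eto_pred, dtype=float)
    data_loss = np.mean((eto_pred - np.asarray(eto_obs, dtype=float)) ** 2)
    sp_loss = np.mean((eto_pred - np.asarray(eto_sp, dtype=float)) ** 2)
    return float((1.0 - theta) * data_loss + theta * sp_loss)
```

With θ = 0.5, the equal contribution scenario, both terms carry the same weight, so a prediction that matches the observations exactly but is 1 unit away from the SP estimate yields a loss of 0.5.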
The proposed PINN was trained and evaluated first for the equal contribution scenario (θ = 0.5) and then for θ in the interval [0, 1] with a step of 0.2, in order to analyze the impact of θ on PINN performance. For the equal contribution scenario, integrating the SP model improved PINN performance in all scenarios in terms of RMSE and R², surpassing both the fully data-driven model (θ = 0) and the baseline model (θ = 1). Additionally, for all θ within the interval [0.2, 0.8], the PINN required less training to reach optimal values. Finally, the optimal θ values were determined for each scenario using CMA-ES and were 0.258, 0.771, 0.7226 and 0.169 for PT, MK, HS and AB, respectively. PINNs proved a promising approach for accurate ETo estimation and, consequently, for improved water resource management; the study also represents a step towards implementing controlled, trustworthy, and physics-informed AI in environmental science.
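The evaluation metrics and the θ grid mentioned above can be sketched in Python as follows. The helper names `rmse` and `r2` are hypothetical, and R² is computed here as the coefficient of determination, which is an assumption: the abstract does not say whether the authors use this definition or the squared correlation coefficient.

```python
import numpy as np


def rmse(obs, pred):
    """Root mean squared error between observed and predicted ETo."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def r2(obs, pred):
    """Coefficient of determination (assumed definition of R² in this sketch)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)        # residual sum of squares
    ss_tot = np.sum((obs - obs.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)


# θ values from 0 to 1 with a step of 0.2, as described in the abstract.
theta_grid = np.linspace(0.0, 1.0, 6)
```

One model would be trained per entry of `theta_grid`, and `rmse` and `r2` would then be computed on held-out data to compare the settings.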
About the journal:
Chemosphere is an international multidisciplinary journal dedicated to publishing original communications and review articles on chemicals in the environment. Its scope covers the identification, quantification, behavior, fate, toxicology, treatment, and remediation of chemicals in the biosphere, hydrosphere, lithosphere, and atmosphere.