Causal hybrid modeling with double machine learning—applications in carbon flux modeling

IF 6.3 2区 物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls
{"title":"Causal hybrid modeling with double machine learning—applications in carbon flux modeling","authors":"Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls","doi":"10.1088/2632-2153/ad5a60","DOIUrl":null,"url":null,"abstract":"Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to achieve these purposes. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior in estimating causal parameters over end-to-end deep neural network approaches, proving efficiency, robustness to bias from regularization methods, and circumventing equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":null,"pages":null},"PeriodicalIF":6.3000,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad5a60","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to achieve these purposes. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior in estimating causal parameters over end-to-end deep neural network approaches, proving efficiency, robustness to bias from regularization methods, and circumventing equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.
双机器学习的因果混合建模--在碳通量建模中的应用
混合建模将机器学习与科学知识相结合,以增强可解释性、概括性和对自然规律的遵循。然而,等价性和正则化偏差给混合建模实现这些目的带来了挑战。本文介绍了一种通过因果推理框架来估计混合模型的新方法,特别是采用双重机器学习(DML)来估计因果效应。我们在两个与二氧化碳通量有关的问题上展示了这种方法在地球科学中的应用。在 Q10 模型中,我们证明了基于 DML 的混合建模在估计因果参数方面优于端到端深度神经网络方法,证明了其效率、对正则化方法产生的偏差的鲁棒性以及规避等效性。我们的方法适用于碳通量分区,在适应异质因果效应方面表现出灵活性。该研究强调了明确定义因果图和因果关系的必要性,并倡导将此作为一般最佳实践。我们鼓励在混合模型中继续探索因果关系,以便在知识引导的机器学习中获得更可解释、更可信的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Machine Learning Science and Technology
Machine Learning Science and Technology Computer Science-Artificial Intelligence
CiteScore
9.10
自引率
4.40%
发文量
86
审稿时长
5 weeks
期刊介绍: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信