Physics-Guided Machine Learning from Simulation Data: An Application in Modeling Lake and River Systems

X. Jia, Yiqun Xie, Sheng Li, Shengyu Chen, J. Zwart, J. Sadler, A. Appling, S. Oliver, J. Read
Published in: 2021 IEEE International Conference on Data Mining (ICDM), 2021-12-01
DOI: 10.1109/ICDM51629.2021.00037 (https://doi.org/10.1109/ICDM51629.2021.00037)
Citation count: 5

Abstract

This paper proposes a new physics-guided machine learning approach that incorporates the scientific knowledge in physics-based models into machine learning models. Physics-based models are widely used to study dynamical systems in a variety of scientific and engineering problems. Although they are built on general physical laws that govern the relations from input to output variables, these models often produce biased simulations due to inaccurate parameterizations or approximations used to represent the true physics. In this paper, we aim to build a new data-driven framework to monitor dynamical systems by extracting general scientific knowledge embodied in simulation data generated by the physics-based model. To handle the bias in simulation data caused by imperfect parameterization, we propose to extract general physical relations jointly from multiple sets of simulations generated by a physics-based model under different physical parameters. In particular, we develop a spatio-temporal network architecture that uses its gating variables to capture the variation of physical parameters. We initialize this model using a pre-training strategy that helps discover common physical patterns shared by different sets of simulation data. Then we fine-tune it using limited observation data via a contrastive learning process. By leveraging the complementary strengths of machine learning and domain knowledge, our method has been shown to produce accurate predictions, use fewer training samples, and generalize to out-of-sample scenarios. We further show that the method can provide insights about the variation of physical parameters over space and time in two domain applications: predicting temperature in streams and predicting temperature in lakes.
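The pre-train-then-fine-tune workflow described in the abstract can be illustrated with a deliberately small toy sketch. This is not the paper's spatio-temporal network: it has no gating variables and no contrastive learning, and it replaces the model with a one-variable linear fit trained by plain gradient descent. The simulator `simulate`, its offset parameter `k`, the "true" offset of 1.4, and all other names and values are hypothetical, chosen only to show how pooling simulations run under several parameter settings might recover a shared relation (here, the slope) that a few observations then correct.

```python
# Toy sketch (assumption, not the authors' method): pre-train a linear model on
# simulations pooled across several parameter settings, then fine-tune it on a
# handful of observations.

def fit_gd(data, w, b, lr, epochs):
    """Full-batch gradient descent on mean squared error for y = w * x + b."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(data, w, b):
    """Mean squared error of y = w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def simulate(x, k):
    """Hypothetical physics-based simulator: correct slope, offset set by k."""
    return 2.0 * x + k

# Simulations under several (imperfect) parameter values, pooled together.
xs = [i / 10 for i in range(11)]
sim_params = [0.0, 0.5, 2.0]
simulations = [(x, simulate(x, k)) for k in sim_params for x in xs]

# Pre-training: learns the shared slope, but an offset biased by the parameters.
w_pre, b_pre = fit_gd(simulations, 0.0, 0.0, lr=0.1, epochs=2000)

# A few observations of the hypothetical true system (offset 1.4).
observations = [(x, 2.0 * x + 1.4) for x in (0.1, 0.5, 0.9)]

# Fine-tuning starts from the pre-trained weights; the baseline starts from zero
# and gets the same small training budget.
w_ft, b_ft = fit_gd(observations, w_pre, b_pre, lr=0.05, epochs=200)
w_sc, b_sc = fit_gd(observations, 0.0, 0.0, lr=0.05, epochs=200)

print("pre-trained error on observations:", mse(observations, w_pre, b_pre))
print("fine-tuned error on observations: ", mse(observations, w_ft, b_ft))
print("from-scratch error on observations:", mse(observations, w_sc, b_sc))
```

In this toy setting the pre-trained model already carries the slope shared by all simulation sets, so the short fine-tuning run only has to correct the offset. That is the intuition behind needing fewer observations, though the example says nothing about how the paper's architecture performs.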