Enhancing machine learning thermobarometry for clinopyroxene-bearing magmas

Impact factor 4.2; Zone 2, Earth Sciences; JCR Q1, Computer Science, Interdisciplinary Applications
Mónica Ágreda-López, Valerio Parodi, Alessandro Musu, Corin Jorgenson, Alessandro Carfì, Fulvio Mastrogiovanni, Luca Caricchi, Diego Perugini, Maurizio Petrelli
Computers & Geosciences, Volume 193, Article 105707. Published 2024-08-31. DOI: 10.1016/j.cageo.2024.105707
Publisher page: https://www.sciencedirect.com/science/article/pii/S0098300424001900

Abstract

In this study, we propose a general workflow to enhance ML-based geothermobarometer modelling. The workflow focuses on three key areas. First, we developed a robust pre-processing pipeline that addresses data imbalance, feature engineering, and data augmentation. Second, we assessed modelling errors using a Monte Carlo approach to quantify the impact of analytical uncertainties on the final pressure and temperature estimates. Third, we implemented a robust strategy to validate and test the ML models, avoiding over- and under-fitting while correcting biases associated with specific ML models (i.e., tree-based ensembles).
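The Monte Carlo step can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_thermometer`, its coefficients, the oxide values, and the relative uncertainties are all hypothetical placeholders, and a real application would substitute a trained model and measured analytical errors. The idea shown is only the general one: perturb each measured oxide within its analytical uncertainty, re-run the model many times, and summarise the spread of the predictions.

```python
import random
import statistics


def toy_thermometer(composition):
    """Hypothetical stand-in for a trained model: oxide wt% -> temperature (°C)."""
    # Coefficients are arbitrary and chosen for illustration only.
    return 900.0 + 12.0 * composition["MgO"] - 8.0 * composition["Na2O"]


def monte_carlo_estimate(model, composition, rel_sigma, n_draws=2000, seed=0):
    """Propagate analytical uncertainty through `model` by repeated perturbation.

    Each oxide is redrawn from a normal distribution centred on the measured
    value, with standard deviation rel_sigma[oxide] * value.
    Returns the mean and sample standard deviation of the predictions.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    predictions = []
    for _ in range(n_draws):
        perturbed = {
            oxide: rng.gauss(value, rel_sigma[oxide] * value)
            for oxide, value in composition.items()
        }
        predictions.append(model(perturbed))
    return statistics.fmean(predictions), statistics.stdev(predictions)


# Made-up measurement and relative (1-sigma) analytical uncertainties.
measured = {"MgO": 15.0, "Na2O": 0.5}
uncertainty = {"MgO": 0.02, "Na2O": 0.10}

t_mean, t_sd = monte_carlo_estimate(toy_thermometer, measured, uncertainty)
print(f"T = {t_mean:.0f} ± {t_sd:.1f} °C")
```

The standard deviation returned here reflects only the propagated analytical uncertainty of the inputs; it does not include the calibration error of the model itself.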

To facilitate the use of our workflow, we have developed a web app (https://bit.ly/ml-pt-web) and a Python module (https://bit.ly/ml-pt-py). The robustness of this strategy has been tested on two calibrations: clinopyroxene (cpx) and clinopyroxene-liquid (cpx-liq). Our results show a significant reduction in errors compared to the baseline model, as well as good generalization ability on an independent external dataset. The Root Mean Squared Errors are 57 °C and 2.5 kbar for the cpx calibration, and 36 °C and 2.1 kbar for the cpx-liq calibration. Finally, our models show improved outcomes on the external dataset compared to existing ML and classical cpx and cpx-liq thermobarometers.
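For readers unfamiliar with the metric, the Root Mean Squared Error quoted above is the square root of the mean squared difference between reference and predicted values. The sketch below uses made-up temperatures, not data from the study, and the function name `rmse` is an illustrative choice rather than part of the authors' Python module.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up reference and predicted temperatures (°C), for illustration only.
t_observed = [1100.0, 1150.0, 1200.0, 1250.0]
t_predicted = [1110.0, 1140.0, 1230.0, 1250.0]

t_rmse = rmse(t_observed, t_predicted)
print(f"RMSE = {t_rmse:.1f} °C")
```

Because the errors are squared before averaging, a single large miss (30 °C in this example) weighs more heavily than several small ones.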

Source journal: Computers & Geosciences (category: Geosciences, Multidisciplinary)
CiteScore: 9.30
Self-citation rate: 6.80%
Articles published: 164
Review time: 3.4 months
Journal description: Computers & Geosciences publishes high impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.