Enhancement of Machine-Learning-Based Flash Calculations near Criticality Using a Resampling Approach

IF 1.9 Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Computation Pub Date : 2024-01-09 DOI:10.3390/computation12010010

Eirini Maria Kanakaki, Anna Samnioti, V. Gaganis



Abstract

Flash calculations are essential in reservoir engineering applications, most notably in compositional flow simulation and separation processes, where they provide the phase distribution factors, known as k-values, at a given pressure and temperature. The calculation output is subsequently used to estimate composition-dependent properties of interest, such as the equilibrium phases’ molar fractions, compositions, densities, and compressibilities. However, when the flash conditions approach criticality, minor inaccuracies in the computed k-values may lead to significant deviations in the dependent properties, which eventually propagate to the simulator and cause large errors in the simulation. Although several machine-learning-based regression approaches have emerged to drastically accelerate flash calculations, the criticality issue persists. To address this problem, a novel technique for resampling the ML models’ training data population is proposed, which aims to fine-tune the training dataset distribution and optimally exploit the models’ learning capacity across various flash conditions. The results demonstrate significantly improved accuracy of phase behavior predictions near criticality, offering valuable contributions not only to the subsurface reservoir engineering industry but also to the broader field of thermodynamics. By understanding and optimizing the model’s training, this research enables more precise predictions and better-informed decision-making in domains involving phase separation phenomena. The proposed technique is applicable to any ML-based regression problem in which properties that depend on the model output, rather than the model output itself, are of interest.
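
The sensitivity near criticality described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes the standard Rachford-Rice equation for the vapor molar fraction V, given a feed composition z and k-values K, and solves it by bisection. The function names `rachford_rice` and `vapor_fraction_shift`, the two binary mixtures, and the 0.01% relative error are hypothetical choices made only for illustration.

```python
from typing import Sequence


def rachford_rice(z: Sequence[float], K: Sequence[float],
                  tol: float = 1e-12, max_iter: int = 200) -> float:
    """Solve sum_i z_i (K_i - 1) / (1 + V (K_i - 1)) = 0 for V by bisection.

    V is deliberately not clamped to [0, 1]; a value outside that range
    means the mixture is single-phase for the given k-values.
    """
    k_min, k_max = min(K), max(K)
    if not (k_min < 1.0 < k_max):
        raise ValueError("need at least one K > 1 and one K < 1")
    # The residual is strictly decreasing between these two poles,
    # so exactly one root lies in the open interval (lo, hi).
    lo = 1.0 / (1.0 - k_max)
    hi = 1.0 / (1.0 - k_min)

    def residual(v: float) -> float:
        return sum(zi * (ki - 1.0) / (1.0 + v * (ki - 1.0))
                   for zi, ki in zip(z, K))

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid  # root is to the right of mid
        else:
            hi = mid  # root is at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def vapor_fraction_shift(z: Sequence[float], K: Sequence[float],
                         rel_err: float) -> float:
    """Absolute change in V when the first k-value carries a relative error."""
    perturbed = [K[0] * (1.0 + rel_err)] + list(K[1:])
    return abs(rachford_rice(z, perturbed) - rachford_rice(z, K))


# Illustrative binary mixtures (assumed values, not taken from the paper).
far = vapor_fraction_shift([0.5, 0.5], [3.0, 0.3], 1e-4)        # k-values far from 1
near = vapor_fraction_shift([0.505, 0.495], [1.02, 0.98], 1e-4)  # k-values close to 1

print(f"shift in V far from criticality: {far:.2e}")
print(f"shift in V near criticality:     {near:.2e}")
```

With the same relative error in one k-value, the shift in V for the near-critical mixture is orders of magnitude larger than for the mixture far from criticality, which is the error amplification that a regression model trained on k-values alone would pass on to the simulator.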


Source journal

Computation Mathematics-Applied Mathematics

CiteScore: 3.50

Self-citation rate: 4.50%

Articles published: 201

Review time: 8 weeks

About the journal: Computation is a journal of computational science and engineering. Its topics fall into three areas. Computational biology, including, but not limited to: bioinformatics; mathematical modeling, simulation, and prediction of nucleic acid (DNA/RNA) and protein sequences, structure, and functions; mathematical modeling of pathways and genetic interactions; and neuroscience computation, including neural modeling, brain theory, and neural networks. Computational chemistry, including, but not limited to: new theories and methodology, including their applications in molecular dynamics; computation of electronic structure; density functional theory; and design and characterization of materials with computational methods. Computation in engineering, including, but not limited to: new theories, methodology, and applications of computational fluid dynamics (CFD); optimisation techniques and/or application of optimisation to multidisciplinary systems; system identification and reduced-order modelling of engineering systems; and parallel algorithms and high-performance computing in engineering.