A Comparative Study on Fluid Composition Determination from Near Infrared Spectra Using Deep Convolutional Neural Networks and Partial Least Squares Regression

W. Weinzierl, A. Cartellieri, P. Schapotschnikow
DOI: 10.2523/iptc-23264-ms (https://doi.org/10.2523/iptc-23264-ms)
Published: 2024-02-12, in "Day 3 Wed, February 14, 2024"

Abstract

The conventional approach to fluid characterization using partial least squares (PLS) is considered a benchmark in chemometric fluid analysis. As a complementary approach, convolutional neural networks (CNNs) have been shown to provide comparable discrimination capabilities. This comparative study evaluates the performance of both methods for quantitative characterization of downhole fluids from near-infrared (NIR) spectra. Both methods are used to predict the fluid composition as fractions of water, gas, oil, and mud. PLS is a statistical technique designed to model the relationship between two sets of variables, in this case the spectrum and the composition. It relies on representing the variables in a multidimensional latent space. Usually, inference consists of three steps. First, the input (spectrum) is linearly projected into the latent space. Second, the output is calculated in the latent space. Finally, the composition is computed as a linear transformation of the latent output. Instead of using PLS for end-to-end inference, only its first step is used here, for feature extraction. Using the first latent dimension for each component yields features that can be conveniently associated with water, gas, and oil, respectively. These features are then used together with the constant baseline in a multinomial logistic regression to obtain the fractions of the fluid types present in the NIR spectra. The baseline is primarily needed for mud detection. In parallel, several CNN models were trained for fluid characterization from NIR spectra, on both processed and raw data. Hyperparameter optimization of the CNNs was performed using a tree-structured Parzen estimator to obtain a best trial configuration. Scheduling of the optimization loop yielded improved inference results. The PLS and CNN models were compared quantitatively using a k-fold approach, which allows a direct comparison of the methods' performance on input spectra of pure and mixed fluids. Both methods show high accuracy when predicting pure components. The root mean square error (RMSE) is consistently larger for PLS. The CNN models generally show larger variance in the prediction for mud, with minor fractions of water, gas, and oil being inferred. Overall, the quantitative comparison shows improved predictive power for a set of deep CNNs relative to the PLS approach. Improved inference is achieved using raw NIR spectral data. This is particularly interesting because no further pre-processing of the spectra is required, which minimizes porting effort in the development of embedded applications.
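The PLS-based pipeline described in the abstract (latent projection as feature extraction, then multinomial logistic regression with a constant baseline, scored by k-fold RMSE) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the spectra are synthetic, one single-response PLS direction is computed per component as the normalized covariance between channels and target fraction, the regression is fitted by plain gradient descent, and all function names and constants are hypothetical.

```python
import math
import random

COMPONENTS = ["water", "gas", "oil", "mud"]
N_CHANNELS, N_SAMPLES, K_FOLDS = 12, 60, 4  # illustrative sizes only

def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def first_pls_weight(spectra, target, x_mean):
    """First single-response PLS weight: unit-norm covariance of channels with target."""
    y_mean = sum(target) / len(target)
    w = [sum((row[j] - x_mean[j]) * (y - y_mean) for row, y in zip(spectra, target))
         for j in range(len(x_mean))]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]

def features(spectrum, directions, x_mean):
    """One latent score per direction, plus a constant baseline term."""
    centered = [s - m for s, m in zip(spectrum, x_mean)]
    return [sum(c * v for c, v in zip(centered, w)) for w in directions] + [1.0]

def softmax(z):
    top = max(z)
    e = [math.exp(v - top) for v in z]
    total = sum(e)
    return [v / total for v in e]

def predict(feat, coef):
    """Predicted fractions for one feature vector; they sum to one by construction."""
    return softmax([sum(c * v for c, v in zip(row, feat)) for row in coef])

def fit_softmax(feats, targets, epochs=200, lr=0.5):
    """Multinomial logistic regression via batch gradient descent on cross-entropy."""
    coef = [[0.0] * len(feats[0]) for _ in targets[0]]
    for _ in range(epochs):
        grad = [[0.0] * len(feats[0]) for _ in targets[0]]
        for f, y in zip(feats, targets):
            p = predict(f, coef)
            for a, row in enumerate(grad):
                for b in range(len(row)):
                    row[b] += (p[a] - y[a]) * f[b] / len(feats)
        coef = [[c - lr * g for c, g in zip(cr, gr)] for cr, gr in zip(coef, grad)]
    return coef

def rmse(pred, true):
    sq = [(p - t) ** 2 for pr, tr in zip(pred, true) for p, t in zip(pr, tr)]
    return math.sqrt(sum(sq) / len(sq))

# Synthetic mixtures: each spectrum is a fraction-weighted sum of made-up signatures.
rng = random.Random(0)
signatures = [[rng.random() for _ in range(N_CHANNELS)] for _ in COMPONENTS]
spectra, fractions = [], []
for _ in range(N_SAMPLES):
    raw = [rng.random() for _ in COMPONENTS]
    frac = [r / sum(raw) for r in raw]
    fractions.append(frac)
    spectra.append([sum(f * sig[j] for f, sig in zip(frac, signatures)) + rng.gauss(0, 0.01)
                    for j in range(N_CHANNELS)])

# k-fold evaluation: refit the directions and the regression on each training split.
fold_rmse = []
for k in range(K_FOLDS):
    test_idx = sorted(range(k, N_SAMPLES, K_FOLDS))
    train_idx = [i for i in range(N_SAMPLES) if i not in test_idx]
    x_tr = [spectra[i] for i in train_idx]
    y_tr = [fractions[i] for i in train_idx]
    x_mean = column_means(x_tr)
    # One direction each for water, gas and oil; mud has no direction of its own here.
    directions = [first_pls_weight(x_tr, [y[c] for y in y_tr], x_mean) for c in range(3)]
    coef = fit_softmax([features(x, directions, x_mean) for x in x_tr], y_tr)
    pred = [predict(features(spectra[i], directions, x_mean), coef) for i in test_idx]
    fold_rmse.append(rmse(pred, [fractions[i] for i in test_idx]))

print([round(r, 3) for r in fold_rmse])
```

The softmax output guarantees that the four predicted fractions are non-negative and sum to one, which is one plausible reason to prefer a multinomial logistic regression over the linear final step of end-to-end PLS. The CNN models and the tree-structured Parzen estimator search are not sketched here, since the abstract gives no architecture or search-space details.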