Universal Representations for Well-Logging Data via Ensembling of Self-Supervised Models

Impact factor 0.5; Region 4 (Mathematics); JCR Q3 (Mathematics)
V. A. Zholobov, E. D. Romanenkova, S. A. Egorov, N. A. Gevorgyan, A. A. Zaytsev
Doklady Mathematics, volume 110, supplement 1, pages S126-S136. Journal Article. Published 2025-03-22.
DOI: 10.1134/S1064562424602257
https://link.springer.com/article/10.1134/S1064562424602257

Abstract

Time series representation learning is crucial in applications requiring sophisticated data analysis. In some areas, like the Oil and Gas industry, the problem is particularly challenging due to missing values and anomalous samples caused by sensor failures in highly complex manufacturing environments. Self-supervised learning is one of the most popular solutions for obtaining data representations. However, whether generative or contrastive, these methods yield embeddings of limited applicability, so general-purpose use is more often declared than achieved.

This study introduces and examines various generative self-supervised architectures for complex industrial time series. Moreover, we propose a new way to ensemble several generative approaches, leveraging the strengths of each method. The suggested procedure is designed to tackle a wide range of scenarios with missing and multiscale data.

For numerical experiments, we use well-log datasets of various scales from diverse oilfields. Evaluation includes change point detection, clustering, and transfer learning, with the last two problems being introduced for the first time. The results show that variational autoencoders excel in clustering, autoregressive models better detect change points, and the proposed ensemble succeeds in both tasks.

Source journal: Doklady Mathematics (Mathematics)
CiteScore: 1.00
Self-citation rate: 16.70%
Articles published: 39
Review time: 3-6 weeks
About the journal: Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics covers the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members of the RAS, Corresponding Members of the RAS, and scientists from the former Soviet Union and other foreign countries. Contributors include outstanding Russian mathematicians.