Digital core: neural network recognition of textual geological and geophysical information

Yu. E. Katanov, A. I. Aristov, A. K. Yagafarov, O. D. Novruzov
Journal: Oil and Gas Studies. Publication date: 2023-07-14. Publication type: Journal Article.
DOI: 10.31660/0445-0108-2023-2-35-54 (https://doi.org/10.31660/0445-0108-2023-2-35-54)

Abstract

An algorithm for the analog-to-digital conversion of primary geological and geophysical information is presented, using the identification of rock lithotypes from text descriptions of physical core as an example. The initial base of qualitative data was formed by combining three types of scientific research: prospecting, interdisciplinary, and applied. Common algorithms for classifying textual information are described, together with a mechanism for preprocessing the initial data by tokenization. The concept of text pattern recognition is implemented using artificial intelligence methods. The neural network model for recognizing textual geological and geophysical information was built in Python in combination with convolutional neural networks for text classification (TextCNN), bidirectional long short-term memory networks (BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT). After the basic version of the model for recognizing qualitative information was developed and tested, this technology stack provided an acceptable level of performance for the algorithm that digitally transforms text data. The best result (current model version 1.0; more than 3000 examples for training and testing) was achieved with the BERT-based text recognition algorithm: Validation Accuracy ~0.830173 at the 25th epoch, Validation Loss ~0.244719, Training Loss ~0.000984, and a recognition probability above 95 % for the studied rock lithotypes. Mechanisms for modifying the code to further improve the accuracy of textual prediction with the created neural network were determined.
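The tokenization preprocessing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the sample core descriptions, the lithotype labels, the regular expression, and the helper names (tokenize, build_vocab, encode) are all assumptions made for illustration. It shows only the generic step of turning free-text descriptions into fixed-length integer sequences that a classifier such as TextCNN or BiLSTM could consume; a BERT model would normally use its own subword tokenizer instead.

```python
import re
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # reserved tokens for padding and unknown words


def tokenize(text):
    """Lowercase a description and split it into word tokens.

    Hyphenated terms such as "fine-grained" are kept as one token.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def build_vocab(texts, min_freq=1):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    vocab = {PAD: 0, UNK: 1}
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Convert a description to a list of ids, truncated or padded to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokenize(text)][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Hypothetical core descriptions and lithotype labels (not taken from the paper).
samples = [
    ("Sandstone, fine-grained, light grey, oil-saturated", "sandstone"),
    ("Siltstone, dark grey, dense, with clay interlayers", "siltstone"),
    ("Argillite, black, fissile", "argillite"),
]

vocab = build_vocab(text for text, _ in samples)
label_ids = {label: i for i, label in enumerate(sorted({l for _, l in samples}))}
encoded = [(encode(text, vocab, max_len=8), label_ids[label]) for text, label in samples]

for ids, label in encoded:
    print(label, ids)
```

Padding every sequence to the same length lets the examples be stacked into one batch; the value of max_len used here is arbitrary and would in practice be chosen from the length distribution of the real descriptions.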