Application of Geocognitive Technologies to Basin & Petroleum System Analyses

Day 1 Mon, November 11, 2019. Pub Date: 2019-11-11. DOI: 10.2118/197610-ms

P. Ruffo, M. Piantanida, Floriana Bergero, P. Staar, C. Bekas



Abstract


Application of Geocognitive Technologies to Basin & Petroleum System Analyses

When dealing with new exploration areas, basin geologists face the challenge of collecting relevant information from all available sources. These include a number of structured commercial databases, but also large corpora of technical documents across which an invaluable amount of information is scattered. Even when search tools help to filter the documents of interest, extracting the information still requires human effort to read and understand the documents. Eni and IBM developed a cognitive engine that uses a deep learning approach to scan documents for basin geology concepts, extract information about petroleum system elements (e.g. formation name, geological age and lithology of source rocks, reservoirs and seals) and let basin geologists run automated queries that collect all the information related to a basin of interest. The collected information is fully referenced to the original paragraphs, tables or pictures of the document in which it was discovered, which makes it possible to validate the robustness of the results. The cognitive engine has been integrated into an application that builds a graphical representation of the Petroleum System Event Charts of the basin, integrating the information extracted from commercial databases, the results from the cognitive engine and the manual input from the geologist. The quality of the results from the cognitive engine has been evaluated using a commercial database that provides both tabular data about basins and detailed PDF reports. The cognitive engine has been trained on the PDF reports alone, and its results have been compared with the tabular content of the database, which represents the ground truth. The cognitive engine succeeded in identifying the right formations, lithologies and geological ages of the petroleum systems with an accuracy in the range 75% to 90%.
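
The evaluation described above, in which extracted attributes are compared with the tabular ground truth, can be sketched as a per-attribute accuracy computation. This is only an illustration under stated assumptions, not the authors' evaluation code: the record layout, the attribute names, the normalisation rule and the sample values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical attribute names; the abstract only states that formations,
# lithologies and geological ages were compared.
ATTRIBUTES = ("formation", "lithology", "age")


def normalise(value):
    """Case- and whitespace-insensitive comparison key (an assumption)."""
    return " ".join(str(value).lower().split())


def attribute_accuracy(extracted, ground_truth):
    """Return {attribute: fraction of ground-truth entries matched}.

    Both arguments map a (basin, element) key to a dict of attribute
    values. An attribute missing from the extraction counts as an error.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for key, truth in ground_truth.items():
        found = extracted.get(key, {})
        for attr in ATTRIBUTES:
            if attr not in truth:
                continue
            total[attr] += 1
            if attr in found and normalise(found[attr]) == normalise(truth[attr]):
                correct[attr] += 1
    return {attr: correct[attr] / total[attr] for attr in total}


# Made-up sample data, for illustration only.
ground_truth = {
    ("Basin A", "source"): {
        "formation": "Alpha Shale", "lithology": "shale", "age": "Jurassic"},
    ("Basin A", "reservoir"): {
        "formation": "Beta Sandstone", "lithology": "sandstone", "age": "Cretaceous"},
}
extracted = {
    ("Basin A", "source"): {
        "formation": "alpha  shale", "lithology": "shale", "age": "Triassic"},
    ("Basin A", "reservoir"): {
        "formation": "Beta Sandstone", "lithology": "sandstone", "age": "Cretaceous"},
}

scores = attribute_accuracy(extracted, ground_truth)
for attr, score in scores.items():
    print(f"{attr}: {score:.0%}")
```

Reporting one score per attribute, rather than a single pooled number, matches the way the abstract quotes a range of accuracies across formations, lithologies and ages.
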
The cognitive engine is built with highly innovative technologies, combining the data-driven capabilities of deep neural networks with more traditional natural language processing methods based on ontologies. Documents are processed with a three-step approach. In the first step, convolutional neural networks (CNN) are used to recognize the structural elements within a technical paper (e.g. title, authors, paragraphs, figures, tables, references) and to convert a complex PDF structure into a clean sequence of text that can be analyzed. In the second step, concepts are extracted from these processed documents using extractors, NLP annotators (based on recurrent neural networks) and aggregators. Finally, the results from the deep learning tools and the provided ontologies are used jointly to build a knowledge graph, which links together all the discovered entities and their relationships. A fit-for-purpose, highly efficient graph database has been developed so that the graph can be traversed with full flexibility, collecting all the concepts needed for basin geology studies.
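
The final step, linking extracted entities into a graph that can be traversed while each fact stays tied to its source passage, might look like the following sketch. It uses plain Python containers rather than the purpose-built graph database mentioned above, and the class, relation names, file names and sample facts are all hypothetical.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Tiny in-memory directed graph with labelled edges and provenance."""

    def __init__(self):
        # node -> list of (relation, target, provenance)
        self._edges = defaultdict(list)

    def add_fact(self, subject, relation, obj, provenance):
        """Store one extracted relationship and where it was found."""
        self._edges[subject].append((relation, obj, provenance))

    def collect(self, start, max_depth=2):
        """Breadth-first traversal returning all facts within max_depth hops."""
        facts = []
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue  # do not expand nodes beyond the depth limit
            for relation, target, provenance in self._edges.get(node, []):
                facts.append((node, relation, target, provenance))
                if target not in seen:
                    seen.add(target)
                    queue.append((target, depth + 1))
        return facts


# Made-up facts, for illustration only.
graph = KnowledgeGraph()
graph.add_fact("Basin A", "has_source_rock", "Alpha Shale", "report_1.pdf, paragraph 12")
graph.add_fact("Alpha Shale", "has_lithology", "shale", "report_1.pdf, table 3")
graph.add_fact("Alpha Shale", "has_age", "Jurassic", "report_2.pdf, paragraph 4")

facts = graph.collect("Basin A")
for subject, relation, obj, provenance in facts:
    print(f"{subject} -[{relation}]-> {obj}  (source: {provenance})")
```

Carrying the provenance string on every edge is what lets a geologist check each collected fact against the paragraph or table it came from.
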
