Implementing a Drilling Reporting Data Mining Tool Using Natural Language Processing Sentiment Analysis Techniques
Author: P. Kowalchuk
DOI: 10.2118/194961-MS (https://doi.org/10.2118/194961-MS)
Journal: Day 2 Tue, March 19, 2019
Publication date: 2019-03-15
Drilling operations generate large volumes of information, including daily drilling reports and reports produced by service companies, support personnel, and other stakeholders. These reports can be unstructured, with information presented in a variety of formats. Extracting this information is frequently challenging, which limits its use in future projects. Natural language processing provides an efficient way of mining these documents and obtaining knowledge from them. This paper demonstrates how these techniques were used to analyze vast amounts of historical documents to quickly rank wells by complexity and determine which aspects of drilling operations were most critical.
Sentiment analysis can be used to classify documents and other pieces of information into separate categories. In social media, it is used to analyze the collective perception of a given trending item. Here, the technique was used to classify wells into two ranked lists. The first listed wells by drilling issues. The second was a complexity ranking, defined so that each well could be classified as easy or difficult to drill. To build the sentiment analysis tool, a random set of training wells and their respective documents was selected. From these documents, a list of words was identified in what became known as highlighting sessions, during which subject matter experts (SMEs) classified words found in the documents. This "bag of words" was then used to train a classifier capable of ranking the wells related to the documents. A probability was associated with each well, giving the likelihood of its inclusion in a given category.
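The workflow described above (an SME word list, a bag-of-words representation, a trained classifier, and a per-well probability used for ranking) can be illustrated with a minimal sketch. The abstract does not name the classifier, so a multinomial Naive Bayes model with Laplace smoothing is assumed here purely for illustration. The vocabulary, training snippets, labels, well names, and function names are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical word list standing in for the output of the SME
# highlighting sessions (illustrative only, not from the paper).
VOCAB = ["stuck", "losses", "kick", "sidetrack", "washout",
         "normal", "smooth", "ahead"]


def bag_of_words(text, vocab=VOCAB):
    """Return the count of each vocabulary word in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in vocab)
    return [counts[w] for w in vocab]


def train_naive_bayes(docs, labels, vocab=VOCAB, alpha=1.0):
    """Fit a multinomial Naive Bayes model (assumed classifier type)."""
    model = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        totals = [0] * len(vocab)
        for d in class_docs:
            for i, n in enumerate(bag_of_words(d, vocab)):
                totals[i] += n
        # Laplace smoothing keeps unseen words from zeroing the likelihood.
        denom = sum(totals) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(len(class_docs) / len(docs)),
            "log_likelihood": [math.log((n + alpha) / denom) for n in totals],
        }
    return model


def predict_proba(model, text, vocab=VOCAB):
    """Return the probability of each category for one document."""
    counts = bag_of_words(text, vocab)
    scores = {
        c: p["log_prior"]
        + sum(n * ll for n, ll in zip(counts, p["log_likelihood"]))
        for c, p in model.items()
    }
    # Normalize in log space for numerical stability.
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: v / total for c, v in exps.items()}


def rank_wells(model, well_reports, category="difficult"):
    """Sort wells by their probability of belonging to the category."""
    scored = [(well, predict_proba(model, text)[category])
              for well, text in well_reports.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical training documents labeled by complexity.
train_docs = [
    "pipe stuck at 2100 m, severe losses, sidetrack required",
    "took a kick, washout in string, stuck again",
    "drilled ahead, normal parameters, smooth trip",
    "smooth run, drilled ahead with normal returns",
]
train_labels = ["difficult", "difficult", "easy", "easy"]
model = train_naive_bayes(train_docs, train_labels)

# Hypothetical unseen wells, one report snippet each.
new_wells = {
    "W-101": "stuck pipe and losses while tripping",
    "W-102": "drilled ahead, smooth and normal",
}
ranking = rank_wells(model, new_wells)
for well, prob in ranking:
    print(f"{well}: P(difficult) = {prob:.3f}")
```

In this toy setup the well whose report mentions stuck pipe and losses ranks first for the "difficult" category. A real application would aggregate many documents per well and would depend heavily on the quality of the word list and training set.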
The methodology proved successful, ranking drilling documents in both defined category sets. Results show that SMEs can use the list of ranked wells to identify which wells are relevant and deserve detailed analysis. The lists generated for both categories provided a guideline for further analysis, particularly by identifying wells with little value. Results also showed the importance of correctly developing the word list, selecting an adequate training set, and accounting for the language used, as well as the need for SMEs to produce the final analysis. The technology showed promising results, and real-world applications are conceivable at its current level of maturity. However, the results also indicated room for improving its effectiveness by refining the highlighting sessions, word lists, types of classifier used, and final ranking methodology.
The use of methods and technology to help improve and enable the analysis of unstructured data in the drilling space should increase over time. This paper shows how current technology can already be used in practical real-life cases to produce tangible value.