Luciana Maria da Silva, Leandro M. Ferreira, G. Avansi, D. Schiozer, S. N. Alves-Souza
Selection of a Dimensionality Reduction Method: An Application to Deal with High-Dimensional Geostatistical Realizations in Oil Reservoirs
DOI: https://doi.org/10.2118/212299-pa
Published: 2022-10-01
Citations: 3
Abstract
One of the challenges in reservoir engineering studies is working with essential high-dimensional inputs, such as porosity and permeability, which govern fluid flow in porous media. Dimensionality reduction (DR) methods make it possible to account for spatial variability when constructing a fast objective function estimator (FOFE). This study presents a methodology for selecting an adequate DR method to deal with high-dimensional spatial attributes with more than 10^5 dimensions. We investigated 18 DR methods commonly applied in the literature. The proposed workflow comprises (1) definition of an adequate number of dimensions; (2) evaluation of the elapsed computational time needed to generate each reduced data set; (3) training using automated machine learning (AutoML); (4) validation using the root mean square logarithmic error (RMSLE) and the 95% confidence interval (CI); (5) a score equation combining elapsed computational time and RMSLE; and (6) a consistency check to evaluate whether the FOFE reliably mimics the simulator output. As an application, we used the FOFE to generate risk curves at the end of the forecast period (10,957 days). We identified methods that reduced the high-dimensional spatial attributes in less than 10 minutes of computational time, which allowed us to consider them when building the FOFE. With the selected approaches, we could handle high-dimensional spatial variability. Moreover, the selected DR method can be used in highly complex problems to build an FOFE and avoid overfitting when a massive amount of data is used.
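Steps (2), (4), and (5) of the workflow can be illustrated with a minimal Python sketch. The RMSLE function follows the standard definition of the metric. The abstract does not give the score equation, so `combined_score` is a hypothetical stand-in: a weighted sum of normalized elapsed time and normalized RMSLE, where lower is better. The function names, the weight `w_time`, and the example numbers are assumptions made for illustration and do not come from the study.

```python
import math
import time


def rmsle(y_true, y_pred):
    """Root mean square logarithmic error for non-negative values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def combined_score(elapsed, error, max_elapsed, max_error, w_time=0.5):
    """Hypothetical score (NOT the paper's equation): weighted sum of
    elapsed time and RMSLE, each normalized by the worst observed value.
    Lower is better."""
    t_norm = elapsed / max_elapsed if max_elapsed > 0 else 0.0
    e_norm = error / max_error if max_error > 0 else 0.0
    return w_time * t_norm + (1.0 - w_time) * e_norm


def rank_methods(results, w_time=0.5):
    """results maps a method name to (elapsed_seconds, rmsle).
    Returns a list of (name, score) pairs sorted best first."""
    max_t = max(t for t, _ in results.values())
    max_e = max(e for _, e in results.values())
    scored = {
        name: combined_score(t, e, max_t, max_e, w_time)
        for name, (t, e) in results.items()
    }
    return sorted(scored.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from the study.
    example = {
        "method_a": (1.0, 0.40),
        "method_b": (10.0, 0.10),
        "method_c": (2.0, 0.12),
    }
    for name, score in rank_methods(example):
        print(f"{name}: {score:.3f}")
```

Normalizing both terms by the worst observed value keeps time (seconds) and RMSLE (dimensionless) on a comparable scale before weighting. A different normalization or weighting would change the ranking, so the choice should be treated as a tunable assumption.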