{"title":"Geoweaver_cwl: Transforming geoweaver AI workflows to common workflow language to extend interoperability","authors":"Amruta Kale , Ziheng Sun , Chao Fan , Xiaogang Ma","doi":"10.1016/j.acags.2023.100126","DOIUrl":"10.1016/j.acags.2023.100126","url":null,"abstract":"<div><p>Recently, workflow management platforms are gaining more attention in the artificial intelligence (AI) community. Traditionally, researchers self-managed their workflows in a manual and tedious way that heavily relies on their memory. Due to the complexity and unpredictability of AI models, they often struggled to track and manage all the data, steps, and history of the workflow. AI workflows are time-consuming, redundant, and error-prone, especially when big data is involved. A common strategy to make these workflows more manageable is to use a workflow management system, and we recommend Geoweaver, an open-source workflow management system that enables users to create, modify and reuse AI workflows all in one place. To make our work in Geoweaver reusable by the other workflow management systems, we created an add-on functionality <strong><em>geoweaver_cwl</em></strong>, a Python package that automatically converts Geoweaver AI workflows into the Common Workflow Language (CWL) format. It will allow researchers to easily share, exchange, modify, reuse, and build a new workflow from existing ones in other CWL-compliant software. A user study was conducted with the existing workflows created by Geoweaver to collect suggestions and fill in the gaps between our package and Geoweaver. The evaluation confirms that <strong><em>geoweaver_cwl</em></strong> can lead to a well-versed AI process while disclosing opportunities for further extensions. The <strong><em>geoweaver_cwl</em></strong> package is publicly released online at <span>https://pypi.org/project/geoweaver-cwl/0.0.1/</span><svg><path></path></svg>.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100126"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46784253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A practical approach for discriminating tectonic settings of basaltic rocks using machine learning","authors":"Kentaro Nakamura","doi":"10.1016/j.acags.2023.100132","DOIUrl":"10.1016/j.acags.2023.100132","url":null,"abstract":"<div><p>Elucidating the tectonic setting of unknown rock samples has long attracted the interest of not only igneous petrologists but also a wide range of geoscientists. Recently, attempts have been made to use machine learning to discriminate the tectonic setting of igneous rocks. However, few studies have designed methods that are applicable to altered rocks. This study proposes a novel approach that utilizes the ratio of elements less susceptible to weathering, alteration, and metamorphism as feature values for analyzing altered basalts. The method was evaluated using six well-established machine learning algorithms: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), Light Gradient Boosting Machine (LightGBM), eXtreme Gradient Boosting (XGBoost), and Multi-Layer Perceptron (MLP). The results show that KNN achieved the highest classification score of 83.9% in the balanced accuracy of classifying the eight tectonic settings, closely followed by SVM with a score of 83.7%. In addition, oceanic and arc/continental basalts could also be discriminated against with an accuracy of more than ∼90% for KNN. This study suggested that the machine learning method can discriminate tectonic settings more accurately and reliably than previously used discrimination diagrams by designing appropriate feature values.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100132"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46128134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GeoSim: An R-package for plurigaussian simulation and Co-simulation between categorical and continuous variables","authors":"George Valakas, Konstantinos Modis","doi":"10.1016/j.acags.2023.100130","DOIUrl":"https://doi.org/10.1016/j.acags.2023.100130","url":null,"abstract":"<div><p>Plurigaussian simulation is widely used to model geological facies in geosciences and is predominantly applied in mineral deposits and petroleum reservoirs exploration. GeoSim package builds geostatistical models of categorical regionalized variables via conditional or unconditional Plurigaussian simulation and co-simulation. Co-simulation between Gaussian Random Fields representing the geological facies and other numerical variables accounting for auxiliary hydrological or geophysical quantities, is also available in this package with the definition of a linear coregionalization model. The algorithm is not restricted by the number of simulated facies and the number of truncated Gaussians, while parts of the code requiring heavy computations are compiled in C++ taking benefits of the integration between R and C++. In this work, we introduce the GeoSim package and demonstrate its capabilities. We present a 3D application focused on a lignite mine in Greece, where we investigate the Plurigaussian simulation and co-simulation of five geological facies (categorical variables) and the lower calorific value (continuous variable). The findings of our study highlight the significant benefits of Plurigaussian and co-simulation to capture the geological spatial heterogeneity.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100130"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49727523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D hydrostratigraphic and hydraulic conductivity modelling using supervised machine learning","authors":"Tewodros Tilahun , Jesse Korus","doi":"10.1016/j.acags.2023.100122","DOIUrl":"10.1016/j.acags.2023.100122","url":null,"abstract":"<div><p>Accurately modeling highly heterogenous aquifers is one of the big challenges in hydrogeology. There is a pressing need to develop new methods that transform high-resolution data into hydrogeological parameters representative of such aquifers. We use random forest-based machine learning to predict the distribution of hydrostratigraphic units and hydraulic conductivity (K) at a regional scale. We used lithologic logs from >2000 boreholes and resistivity-depth models from 2717 km of Airborne Electromagnetics (AEM). Eighty unique lithologic categories are lumped into 5 hydrostratigraphic units. K data is derived from descriptions of grain size and texture. The input data are resampled into a 200 × 200 × 1m grid and split into 70% training and 30% validation. K prediction had a training F1 score of 95% and 87% testing accuracy. After hyperparameter tuning these scores improved to 99.6% and 92%, respectively. Hydrostratigraphic unit prediction showed a training F1 score of 97% and 91% testing accuracy, improving to 100% and 95% after hyperparameter tuning. This method produces a high-resolution 3D model of K and hydrostratigraphic units that fills gaps between widely spaced boreholes. It is applicable in any setting where boreholes and AEM are available and can be used to build robust groundwater models for heterogeneous aquifers.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100122"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49498133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncertainties in 3-D stochastic geological modeling of fictive grain size distributions in detrital systems","authors":"Alberto Albarrán-Ordás , Kai Zosseder","doi":"10.1016/j.acags.2023.100127","DOIUrl":"https://doi.org/10.1016/j.acags.2023.100127","url":null,"abstract":"<div><p>Geological 3-D models are very useful tools to predict subsurface properties. However, they are always subject to uncertainties, starting from the primary data. To ensure the reliability of the model outputs and, thus, to support the decision-making process, the incorporation and quantification of uncertainties have to be integrated into the geo-modeling strategies. Among all modeling approaches, the novel <em>D</em><sub><em>i</em></sub> models method was conceived as a stochastic approach to make predictions of the 3-D lithological composition of detrital systems, based on estimating the fictive grain size distribution of the sediment mixture by using soil observations from drilled materials. Within the present study, we aim to adapt the geo-modeling framework of this method in order to incorporate uncertainties linked to systematic imprecisions in the soil observations used as input data. Following this, uncertainty quantification measures are proposed, based on entropy and joint entropy for the main outcomes of the method, i.e., the partial percentile lithological models, and for the whole sediment mixture. Both the ability of the uncertainty quantification measures and the uncertainty propagation derived from the extension of the method are investigated in the model outcomes in a simulation experiment with real data conducted in a small-scale domain located in Munich (Germany). The results show that this adaptation of the <em>D</em><sub><em>i</em></sub> models method overcomes potential bias caused by ignoring imprecise input data, thus providing a more realistic assessment of uncertainty. The uncertainty measures provide very useful insight for quantifying local uncertainties, comparing between average uncertainties and for better understanding how the implementation parameters of the geo-modeling process influence the property estimation and the underlying uncertainties. The main findings of the present study have great potential for providing robust uncertainty information about model outputs, which ultimately strengthens the decision-making process for practical applications based on the implementation of the <em>D</em><sub><em>i</em></sub> models method.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100127"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49759763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super-resolution in thin section of lacustrine shale reservoirs and its application in mineral and pore segmentation","authors":"Chao Guo, Chao Gao, Chao Liu, Gang Liu, Jianbo Sun, Yiyi Chen, Chendong Gao","doi":"10.1016/j.acags.2023.100133","DOIUrl":"10.1016/j.acags.2023.100133","url":null,"abstract":"<div><p>Lacustrine shale reservoirs present intricate attributes such as the prevalence of lamination, rapid sedimentary phase transitions, and pronounced heterogeneity. These factors introduce substantial challenges in analyzing and comprehending reservoir characteristics. Thin-section imaging offers a direct medium to observe these traits, yet the intrinsic compromise between image resolution and field of view impedes the concurrent capture of comprehensive details and contextual overview. This study delves into the application of super-resolution (SR) techniques to augment the segmentation of thin-section images from lacustrine shale, an unconventional reservoir. SR application furnishes high-resolution images, facilitating a robust analysis of morphology, texture, edge properties, and target classification. Utilizing data from the lacustrine shale reservoir of the Ordos Basin, we evaluate our methodology and assess the impact of SR enhancement on segmentation. Quantitative results indicate significant improvements, with the Jaccard index for shale increasing from 0.4790 (Low-Resolution) to 0.7803 (ESRGAN) in the Y channel of the YCrCb color space after level set segmentation, exemplifying the efficacy of SR in shale gas and oil reservoirs. This research underscores the necessity to consider lacustrine shale's unique features while formulating and implementing SR techniques for improved information extraction. Furthermore, it highlights SR's potential for propelling future research and industry-specific applications.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100133"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43641102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subsurface geometry of the Revell Batholith by constrained geophysical modelling, NW Ontario, Canada","authors":"Martin Mushayandebvu , Aaron DesRoches , Martin Bates , Andy Parmenter , Derek Kouhi","doi":"10.1016/j.acags.2023.100121","DOIUrl":"10.1016/j.acags.2023.100121","url":null,"abstract":"<div><p>The Revell batholith is located within the Western Wabigoon terrane of the Superior Province, Northwestern Ontario, Canada, and is a potential site for a deep geological repository (DGR). This batholith is considered to have favourable geoscientific characteristics for hosting a DGR, including a sufficient volume of relatively homogenous rock. The subsurface geometry of the batholith plays an important role in determining its volume, as well as assessing regional-scale hydraulics, rock mechanics, and glacial stress disturbances on the bedrock, which are other important features and processes that can impact the batholith over the timeframes of concern for long-term storage of used nuclear fuel. Subsurface geometry is complicated to unravel, and surface mapping alone is inadequate to obtain the information at depth. However, gravity, magnetic, or seismic data can be used to enhance understanding by approximating the geometry.</p><p>This study aims to refine the subsurface geometry and distribution of the Revell batholith from a constrained forward and inverse geophysical model, incorporating high-resolution geophysical data together with a compilation of historic and recent geological field data. The Revell batholith was previously cited as a flat-based pluton with a depth of 1.6 km, where our findings suggest the batholith is deeper than previously thought, with an uneven contact geometry at its base that extends slightly deeper than 3.5 km. Model uncertainties were assessed by varying probabilistic constraints on volume overlap/commonality and shape within GeoModeller™. Results indicate that overall batholith-greenstone contact is generally unchanged when the geological constraints are varied, providing a high degree of confidence that the Revell batholith has a sufficient volume of relatively homogeneous bedrock.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100121"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41463873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Python programs to apply regularized derivatives in the magnetic tilt derivative and gradient intensity data processing: A graphical procedure to choose the regularization parameter","authors":"Janaína Anjos Melo, Carlos Alberto Mendonça, Yara Regina Marangoni","doi":"10.1016/j.acags.2023.100129","DOIUrl":"10.1016/j.acags.2023.100129","url":null,"abstract":"<div><p>The Tikhonov regularization parameter is a key parameter controlling the smoothness degree and oscillations of a regularized unknown solution. Usual methods to determine a proper parameter (L-curve or the discrepancy principle, for example) are not readily applicable to the evaluation of regularized derivatives, since this formulation does not make explicit a set of model parameters that are necessary to implement these methods. We develop a procedure for the determination of the regularization parameter based on the graphical construction of a characteristic “staircase” function associated with the <span><math><mrow><msub><mi>L</mi><mn>2</mn></msub></mrow></math></span>-norm of the regularized derivatives for a set of trial regularization parameters. This function is independent of model parameters and presents a smooth and monotonic variation. The regularization parameters at the upper step (low values) of the ''staircase'' function provide equivalent results to the non-regularized derivative, the parameters at the lower step (high values) leading to over-smoothed derivatives. For the evaluated data sets, the proper regularization parameter is located in the slope connecting these two flat end-members of the staircase curve, thus balancing noise amplification against the amplitude loss in the transformed fields. A set of Python programs are presented to evaluate the regularization procedure in a well-known synthetic model composed of multiple (bulk and elongated) magnetic sources. This numerical approach also is applied in gridded aeromagnetic data covering high-grade metamorphic terrains of the Anápolis-Itauçu Complex in the Brasília Fold Belt central portion of Tocantins Province, central Brazil, characterized by multiple magnetic lineaments with different directions and intersections which are associated with shear zones, geologic faults, and intrusive bodies. The results obtained from the regularization procedure show efficiency in improving the maps of filtered fields, better tracking the continuity of magnetic lineaments and general geological trends. The results from the application in the Brasília Fold Belt enhance the importance and broader coverage of the Pirineus Zone of High Strain.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100129"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47634978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Near surface sediments introduce low frequency noise into gravity models","authors":"G.A. Phelps, C. Cronkite-Ratcliff","doi":"10.1016/j.acags.2023.100131","DOIUrl":"10.1016/j.acags.2023.100131","url":null,"abstract":"<div><p>3D geologic modeling and mapping often relies on gravity modeling to identify key geologic structures, such as basin depth, fault offset, or fault dip. Such gravity models generally assume either homogeneous or spatially uncorrelated densities within modeled rock bodies and overlying sediments, with average densities typically derived from surface and drill-hole sampling. The noise contributed to the gravity anomaly by these density assumptions is zero in the homogeneous case and typically <200 μGal in the uncorrelated case. Rock bodies and sediments, however, show both a range of densities and spatial correlation of these densities, in both surface and drill-hole samples, and this correlation causes an increase in power in the low frequency content of the resulting gravity anomaly. Spatial correlation of densities can be modeled as a Gaussian random field (GRF), with the random field parameters derived from drill-hole and geologic map data. Data from alluvial fan sediments in southern Nevada indicate correlation lengths of up to 300 m in the vertical dimension and kilometers in the horizontal dimension. The resulting GRF density models show that the noise contributed to the measured gravity anomaly is of low frequency and can be several mGal in amplitude, contradicting the common attribution of lower frequencies to deeper sources. This low-frequency noise increases in power with an increase in sediment thickness. Its presence increases the ambiguity of interpretations of subsurface geologic body shape, such as basin analyses that attempt to quantify concealed basement fault depths, offsets, and dip angles. In the southwestern United States, where basin analyses are important for natural resource applications, such ambiguity increases the uncertainty of subsequent process modeling.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100131"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46261589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning approach using legacy geophysical datasets to model Quaternary marine paleotopography","authors":"Jeffrey Obelcz , Trilby Hill , Davin J. Wallace , Benjamin J. Phrampus , Jordan H. Graw","doi":"10.1016/j.acags.2023.100128","DOIUrl":"10.1016/j.acags.2023.100128","url":null,"abstract":"<div><p>High-resolution subsurface marine mapping tools, including chirp and 3D seismic, enable the reconstruction of ancient landscapes that have been buried and subsequently submerged by marine transgression. However, the established methods for paleotopographic reconstruction require time consuming field and data interpretation efforts. Here we present a novel methodology using machine learning to estimate Marine Isotope Stage 2 (MIS2) paleotopography over a large (22 000 km<sup>2</sup>) area of the Northern Gulf of Mexico with meter-scale accuracy (2.7 m mean prediction error, 4.3 m 1-σ mean uncertainty). A relatively small area (3300 km<sup>2</sup>) of high-resolution (30 × 30 m) interpreted paleotopography is used as training and validation data, while modern bathymetry and MIS2 paleovalley location (binary deep/shallow paleotopography) are used as predictors. This approach merges the high-resolution of modern mapping techniques and the broad coverage of low-resolution legacy geophysical data. Machine learning-modeled paleotopography is not a substitute for precise high-resolution paleotopography reconstruction techniques, but it can be used to reasonably approximate paleotopography over large areas with greatly reduced expense and expertise.</p></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"19 ","pages":"Article 100128"},"PeriodicalIF":3.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47334666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}