{"title":"Sinkhole susceptibility analysis using machine learning for west central Florida","authors":"Olanrewaju Muili, Hassan A. Babaie","doi":"10.1016/j.acags.2025.100262","DOIUrl":"10.1016/j.acags.2025.100262","url":null,"abstract":"<div><div>This study examined the feasibility and accuracy of applying machine learning for sinkhole classification and prediction and using the results in automated sinkhole susceptibility mapping for west central Florida. A two-stage processing pipeline was developed. In the first stage, we assessed the predictive power of five exemplary machine learning algorithms: random forest (RF), logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), and multilayer perceptron (MLP), and selected the best-performing model. The top-performing model was then used to develop a sinkhole susceptibility map (SSM) in the second stage of the process. Nine feature layers were derived from the collected geospatial data and utilized as conditional variables. Several statistical metrics and receiver operating characteristic curves were utilized to evaluate the accuracy of the models. The results showed that the RF model, with a ROC of 0.984, had the highest prediction capability in the research area.</div><div>We generated a susceptibility map using the RF model, and the study area was classified into high susceptibility (H) and low susceptibility (L) areas. Confusion Matrix (CM) and Matthews Correlation Coefficient (MCC) were used to confirm the results of the sinkhole susceptibility map's classification. We present a model that predicts sinkhole distribution in the study area, and the output of our model is consistent with the sinkhole hazard map that the Florida Division of Emergency Management had previously created. 
This work can assist the government, community, and land managers in creating plans for mitigating hazards and land degradation.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"27 ","pages":"Article 100262"},"PeriodicalIF":2.6,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144514348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced identification of geological discontinuities with deep learning","authors":"Rushan Wang , Martin Ziegler , Michele Volpi , Andrea Manconi","doi":"10.1016/j.acags.2025.100256","DOIUrl":"10.1016/j.acags.2025.100256","url":null,"abstract":"<div><div>Rock mass characterization is essential for various applications in geosciences. Traditional methods, such as manual mapping and interpretation, are labor-intensive and prone to inconsistencies. Although machine learning has advanced in many fields, its application in structural geology, especially for distinguishing different discontinuity types, remains limited. This study presents a deep learning-based approach for identifying geological discontinuities in borehole images, classifying features such as intact walls, induced cracks, and tectonic fault planes, among others. We evaluate deep learning architectures, including standard Convolutional Neural Networks and Transformer-based models, and optimize segmentation performance with multi-scale training, tiling strategies, and tailored loss functions. Our results demonstrate that the Transformer model, particularly SegFormer, outperforms U-Net in detecting complex geological features. The combined use of weighted cross-entropy and focal loss further improves model robustness, especially for underrepresented and challenging features. In addition, the choice of the tiling size significantly affects the classification performance of different geological features. 
This research establishes an efficient and accurate pipeline for automated geological interpretation, with significant implications for subsurface exploration and geotechnical engineering.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"27 ","pages":"Article 100256"},"PeriodicalIF":2.6,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144514346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating neuro-symbolic AI and knowledge graph for enhanced geochemical prediction in copper deposits","authors":"Weilin Chen, Jiyin Zhang, Wenjia Li, Xiang Que, Chenhao Li, Xiaogang Ma","doi":"10.1016/j.acags.2025.100259","DOIUrl":"10.1016/j.acags.2025.100259","url":null,"abstract":"<div><div>The integration of machine learning (ML) and deep learning (DL) in geoscience has demonstrated great promise for mineral prediction. However, existing approaches are predominantly data-driven and often overlook expert geological knowledge, limiting their interpretability, accuracy, and practical applicability. This study introduces a new method that combines Large Language Models (LLMs), knowledge graphs (KGs), and Neuro-Symbolic AI (NSAI) models to predict mineralization systems in diverse copper deposits, significantly increasing the precision in prediction results. We utilize LLMs to generate KGs from geological literature, extracting symbolic rules that encode domain-specific insights about copper mineralization. These rules, derived dynamically from expert knowledge, are integrated into ML models as guidance during the training and prediction phases. By fusing symbolic reasoning with ML's computational power, our approach overcomes the limitations of black-box models, offering both improved accuracy and transparency in mineral prediction. To validate this method, we apply it to a comprehensive geochemical dataset of global copper deposits. The results show that rule-guided ML models achieve notable performance improvements, outperforming traditional ML methods in accuracy, precision, and robustness. Interpretability is further enhanced by using tools such as SHAP values, which explain the influence of individual geochemical features within the rule-based framework. This combination not only identifies critical geochemical elements like Cu, Fe, and S but also provides coherent, domain-aligned explanations for the predicted mineralization patterns. 
Our findings demonstrate the transformative potential of combining LLMs, KGs, and ML models for mineral prediction. This hybrid approach enables geoscientists to leverage both computational and expert knowledge, achieving a deeper understanding of mineralization systems.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"27 ","pages":"Article 100259"},"PeriodicalIF":2.6,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144331437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of machine learning-based post-processing to improve crowd-sourced urban rainfall categorizations","authors":"Mohammad Ashar Hussain , Venkatesh Budamala , Rajarshi Das Bhowmik","doi":"10.1016/j.acags.2025.100255","DOIUrl":"10.1016/j.acags.2025.100255","url":null,"abstract":"<div><div>In recent years, citizen science has gained significant attention in the hydrometeorological sciences as an alternative to traditional monitoring systems while also raising awareness of natural processes. Crowd participation in reporting rainfall, known as crowdsourcing rainfall, has the potential to provide insights into the spatio-temporal variability of urban rainfall. However, crowdsourcing often suffers from inaccuracies in rainfall classification due to inadequately trained participants. This study investigates whether machine learning models can reduce misclassification in crowd-sourced rainfall reports under a synthetic framework. A state-of-the-art stochastic rainfall generator is deployed to simulate high-resolution rainfall over Bangalore, India, traditionally monitored by only two rain gauge stations. The study assumes that the 'synthetic' crowd reports qualitative descriptions of two rainfall characteristics—intensity and duration—based on which a categorization of a rainfall event (normal/moderate/severe) is issued. Ten scenarios are introduced to represent varying degrees of misclassification in the crowd reports. Two machine learning models, random forest and logistic regression, are employed to address these misclassifications and improve the resulting rainfall categorization. The findings indicate that while the random forest model outperforms logistic regression, its performance declines as misclassification rates increase. 
Moreover, the study highlights that increasing the number of participants significantly enhances the post-processing performance, emphasizing the importance of properly training the crowd for accurate reporting.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100255"},"PeriodicalIF":2.6,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of ETAS parameter estimates across different time windows within the North and East Anatolian Fault Zones, Turkey","authors":"Suchanun Piriyasatit , Ercan Engin Kuruoglu , Mehmet Sinan Ozeren","doi":"10.1016/j.acags.2025.100253","DOIUrl":"10.1016/j.acags.2025.100253","url":null,"abstract":"<div><div>Located at the intersection of major lithospheric plates, Turkey is characterized by significant seismic activity, particularly along the North Anatolian Fault (NAF) and East Anatolian Fault (EAF). This paper employs the Epidemic-Type Aftershock Sequence (ETAS) model, fitted using the BFGS quasi-Newton method, to study earthquake triggering processes along these faults from 1990 to 2023. Our findings show distinct temporal variations in seismicity parameters along these faults. Along the NAF, the ETAS model highlighted a lower background seismicity rate (<span><math><mi>μ</mi></math></span>) and aftershock productivity (<span><math><msub><mrow><mi>K</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>) compared to the EAF. In contrast, the EAF exhibits lower magnitude sensitivity (<span><math><mi>α</mi></math></span>), indicating that smaller earthquakes are more likely to trigger aftershocks, due to weaker dependence on mainshock magnitude. The aftershock decay rate (<span><math><mi>p</mi></math></span>) is notably faster in the NAF, suggesting quicker post-event stabilization. Our analysis across different time windows reveals significant non-stationarities in ETAS parameters, indicating that seismic behaviors along these faults do not strictly follow historical patterns. This temporal variability highlights the challenges in short-term seismic forecasting using historical data alone. 
A detailed comparison of ETAS parameters across time frames showcases the necessity for incorporating dynamic modeling approaches to improve earthquake forecasting in seismically active regions.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100253"},"PeriodicalIF":2.6,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144279364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D clay microstructure synthesis using Denoising Diffusion Probabilistic Models","authors":"Ali Aouf , Eric Laloy , Bart Rogiers , Christophe De Vleeschouwer","doi":"10.1016/j.acags.2025.100248","DOIUrl":"10.1016/j.acags.2025.100248","url":null,"abstract":"<div><div>This work is concerned with the challenging task of generating 3D-consistent binary microstructures of heterogeneous clay materials. We leverage denoising diffusion probabilistic models (DDPMs) to do so and show that DDPMs outperform two classical generative adversarial networks (GANs) for a 2D generation task. Next, our experiments demonstrate that our DDPMs can produce high-quality, diverse realizations that well capture the spatial statistics of two distinct clay microstructures. Moreover, we show that DDPMs can be implicitly trained to generate porosity-conditioned samples. To the best of our knowledge, this is the first study that addresses clay microstructure generation with DDPMs.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100248"},"PeriodicalIF":2.6,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144189514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Indian summer monsoon prediction: Deep learning approach for skillful long-lead forecasts of rainfall","authors":"Kalpesh R. Patil, Takeshi Doi, J.V. Ratnam, Swadhin K. Behera","doi":"10.1016/j.acags.2025.100257","DOIUrl":"10.1016/j.acags.2025.100257","url":null,"abstract":"<div><div>The prediction of the Indian summer monsoon rainfall (ISMR) in the June–September (JJAS) season at long-lead times is challenging. State-of-the-art dynamical models often fail to capture the sign and amplitude of the rainfall anomalies in the extreme rainfall seasons, limiting the overall skill of the models. We attempted to address this issue using a deep learning model based on convolutional neural networks (CNN). An ensemble of JJAS rainfall predictions using the CNN model with a unique custom function showed high skill in predicting ISMR at a long-lead time of 12 months. The predictions had an anomaly correlation coefficient (ACC) exceeding 0.5 at all the lead times from 2 to 17 months. The CNN model predictions could capture the sign and phase of the extreme rainfall events in the study period realistically. Analysis of saliency-based heatmaps indicated the high skill to be due to the model capturing the leading modes of climate variability, such as the Indian Ocean Dipole and El Niño-Southern Oscillation, realistically. 
The ensemble of CNN ISMR predictions can supplement the predictions of the forecasting centers.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100257"},"PeriodicalIF":2.6,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144291535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-driven dynamic friction models based on Recurrent Neural Networks","authors":"Gaëtan Cortes, Joaquin Garcia-Suarez","doi":"10.1016/j.acags.2025.100249","DOIUrl":"10.1016/j.acags.2025.100249","url":null,"abstract":"<div><div>In this concise contribution, it is demonstrated that Recurrent Neural Networks (RNNs) based on the Gated Recurrent Unit (GRU) architecture possess the capability to learn the complex dynamics of rate-and-state friction (RSF) laws from synthetic data. The data employed for training the network is generated through the application of traditional RSF equations coupled with either the aging law or the slip law for state evolution. A novel aspect of this approach is the formulation of a loss function that explicitly accounts for the direct effect by means of automatic differentiation. It is found that the GRU-based RNNs effectively learn to predict changes in the friction coefficient resulting from velocity jumps (with and without noise in the target data), thereby showcasing the potential of machine learning models in capturing and simulating the physics of frictional processes. Current limitations and challenges are discussed.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100249"},"PeriodicalIF":2.6,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of carbon dioxide phase at bottomhole by adaptive factorization network considering well geometry","authors":"Sungil Kim , Tea-Woo Kim , Yongjun Hong , Hoonyoung Jeong","doi":"10.1016/j.acags.2025.100254","DOIUrl":"10.1016/j.acags.2025.100254","url":null,"abstract":"<div><div>Accurate carbon dioxide (CO<sub>2</sub>) phase prediction at the bottomhole of injection wells is essential for ensuring safe and efficient CO<sub>2</sub> storage and enhanced gas recovery (EGR). Phase misclassification can cause operational inefficiencies, equipment failure, and compromised storage integrity, posing significant risks to CO<sub>2</sub> injection projects. While previous studies have contributed to CO<sub>2</sub> phase prediction, they have overlooked well geometry effects, which can impact reliability in real-world applications. This study addresses these challenges by introducing a deep learning framework based on the adaptive factorization network (AFN), which enhances CO<sub>2</sub> phase prediction accuracy by leveraging feature interactions. The AFN model was trained on ∼43,000 wells across seven major North American shale gas basins, covering a wide range of well geometries and injection conditions. CO<sub>2</sub> phases were classified into supercritical and dense categories, reflecting prevailing flow conditions. To enhance practical applicability, we incorporated real-field wellbore data, ensuring alignment with actual injection environments. The standard AFN model achieved an F1-score of 0.94, with data augmentation further improving performance by reducing false predictions by 50 % and increasing the F1-score to 0.97. Rigorous validation demonstrated the model's robustness for optimizing wellhead temperature to achieve the desired CO<sub>2</sub> phase transition. 
By explicitly considering well geometry effects and real-field conditions, this study advances data-driven CO<sub>2</sub> injection modeling, providing a scalable, high-accuracy framework for evaluating CO<sub>2</sub> storage and EGR feasibility.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100254"},"PeriodicalIF":2.6,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144123572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Soil organic carbon retrieval using a machine learning approach from satellite and environmental covariates in the Lower Brazos River Watershed, Texas, USA","authors":"Birhan Getachew Tikuye, Ram Lakhan Ray","doi":"10.1016/j.acags.2025.100252","DOIUrl":"10.1016/j.acags.2025.100252","url":null,"abstract":"<div><div>Soil is critical in global carbon storage, holding more carbon than terrestrial vegetation and the atmosphere combined. Accurate soil organic carbon (SOC) estimation is essential for improving agricultural productivity and mitigating climate change. This study aims to explore the retrieval of SOC using a machine learning (ML) approach, leveraging remote sensing data and environmental covariates, focusing on the Lower Brazos River Watershed, southern Texas, USA. The study used Sentinel 2A satellite data-derived indices such as vegetation and water indices, topographic features, soil properties, and climatic factors. Three ML models, namely Gradient Boosting (GB), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost), were deployed, with performance assessed using the R<sup>2</sup>, RMSE, and MAE. All explanatory variables are geospatial gridded datasets, except for the point-based measurement of SOC on the Prairie View A&M University (PVAMU) research farm plot used to train the model. The RF model demonstrated the best performance in model testing, with the lowest root mean square error (RMSE = 4.17) and mean absolute error (MAE = 3), as well as the highest coefficient of determination (R<sup>2</sup> = 0.78). GB was the second-best performing model, achieving an RMSE of 4.23 and an MAE of 3.12, with similar R<sup>2</sup> values to the RF model. The average SOC throughout the watershed is 45.5 tons/ha, while the total amount of SOC in the watershed is around 4,278,263 tons. 
These results suggest that integrating satellite data with environmental covariates and machine learning models holds excellent potential for SOC prediction and supports climate change mitigation efforts by improving carbon stock assessments.</div></div>","PeriodicalId":33804,"journal":{"name":"Applied Computing and Geosciences","volume":"26 ","pages":"Article 100252"},"PeriodicalIF":2.6,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}