Building the optimal hybrid spatial Data-Driven Model: Balancing accuracy and complexity
Emanuele Barca, Maria Clementina Caputo, Rita Masciale
International Journal of Applied Earth Observation and Geoinformation, vol. 139, Article 104478. Published 2025-03-26. DOI: 10.1016/j.jag.2025.104478
Impact factor: 7.6 (JCR Q1, Remote Sensing)
https://www.sciencedirect.com/science/article/pii/S1569843225001256
Citations: 0
Abstract
Mapping environmental variables is crucial for natural resource management. Researchers have continually advanced this field with modern techniques such as Integrated Nested Laplace Approximation (INLA), Deep Learning (DL), and Graph Neural Network (GNN) models. While effective, these models often present a significant challenge due to their black-box nature, which obscures the process of generating final maps from raw data. Recent theoretical breakthroughs have shown that white/grey-box models can achieve the same level of accuracy as these advanced techniques, debunking the belief that complex models are necessarily the most accurate. Based on these findings, we have developed a methodology that employs a series of statistical tests and data analytics to identify essential features hidden in spatial data and thereby determine the white- or grey-box predictive model that best approximates the underlying spatial processes. This methodology profiles the model that best adapts to the data, aiding in the selection of the simplest model that achieves the desired accuracy, and so functions similarly to a recommender system for model selection. Furthermore, the set of permissible models includes only regression-like ones, which clarifies the data’s contribution to map construction, and the approach can be applied to a wide range of datasets. By reducing complexity, this approach enhances the transparency of the model’s results. A real-world dataset demonstrates this methodology’s ability to produce highly accurate results.
About the journal:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.