Spatial Statistics | Pub Date: 2024-06-12 | DOI: 10.1016/j.spasta.2024.100843
Aritz Adin, Elias Teixeira Krainski, Amanda Lenzi, Zhedong Liu, Joaquín Martínez-Minaya, Håvard Rue
{"title":"Automatic cross-validation in structured models: Is it time to leave out leave-one-out?","authors":"Aritz Adin , Elias Teixeira Krainski , Amanda Lenzi , Zhedong Liu , Joaquín Martínez-Minaya , Håvard Rue","doi":"10.1016/j.spasta.2024.100843","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100843","url":null,"abstract":"<div><p>Standard techniques such as leave-one-out cross-validation (LOOCV) might not be suitable for evaluating the predictive performance of models incorporating structured random effects. In such cases, the correlation between the training and test sets could have a notable impact on the model’s prediction error. To overcome this issue, an automatic group construction procedure for leave-group-out cross validation (LGOCV) has recently emerged as a valuable tool for enhancing predictive performance measurement in structured models. The purpose of this paper is (i) to compare LOOCV and LGOCV within structured models, emphasizing model selection and predictive performance, and (ii) to provide real data applications in spatial statistics using complex structured models fitted with INLA, showcasing the utility of the automatic LGOCV method. First, we briefly review the key aspects of the recently proposed LGOCV method for automatic group construction in latent Gaussian models. We also demonstrate the effectiveness of this method for selecting the model with the highest predictive performance by simulating extrapolation tasks in both temporal and spatial data analyses. 
Finally, we provide insights into the effectiveness of the LGOCV method in modeling complex structured data, encompassing spatio-temporal multivariate count data, spatial compositional data, and spatio-temporal geospatial data.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2211675324000344/pdfft?md5=58ade5e28808d907246b86bb20b2c270&pid=1-s2.0-S2211675324000344-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141429838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-06-04 | DOI: 10.1016/j.spasta.2024.100837
Chunfeng Huang, Ao Li, Nicholas W. Bussberg, Haimeng Zhang
{"title":"The circular Matérn covariance function and its link to Markov random fields on the circle","authors":"Chunfeng Huang , Ao Li , Nicholas W. Bussberg , Haimeng Zhang","doi":"10.1016/j.spasta.2024.100837","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100837","url":null,"abstract":"<div><p>The connection between Gaussian random fields and Markov random fields has been well-established in Euclidean spaces, with Matérn covariance functions playing a pivotal role. In this paper, we explore the extension of this link to circular spaces and uncover different results. It is known that Matérn covariance functions are not always positive definite on the circle; however, the circular Matérn covariance functions are shown to be valid on the circle and are the focus of this paper. For these circular Matérn random fields on the circle, we show that the corresponding Markov random fields can be obtained explicitly on equidistant grids. Consequently, the equivalence between the circular Matérn random fields and Markov random fields is then exact and this marks a departure from the Euclidean space counterpart, where only approximations are achieved. Moreover, the key motivation in Euclidean spaces for establishing such a link relies on the assumption that the corresponding Markov random field is sparse. We show that such sparsity does not hold in general on the circle. 
In addition, for the sparse Markov random field on the circle, we derive its corresponding Gaussian random field.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141323908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-06-01 | DOI: 10.1016/j.spasta.2024.100838
Paul May, Hossein Moradi Rekabdarkolaee
{"title":"Dimension reduction for spatial regression: Spatial predictor envelope","authors":"Paul May , Hossein Moradi Rekabdarkolaee","doi":"10.1016/j.spasta.2024.100838","DOIUrl":"10.1016/j.spasta.2024.100838","url":null,"abstract":"<div><p>Natural sciences such as geology and forestry often utilize regression models for spatial data with many predictors and small to moderate sample sizes. In these settings, efficient estimation of the regression parameters is crucial for both model interpretation and prediction. We propose a dimension reduction approach for spatial regression that assumes certain linear combinations of the predictors are immaterial to the regression. The model and corresponding inference provide efficient estimation of regression parameters while accounting for spatial correlation in the data. We employed the maximum likelihood estimation approach to estimate the parameters of the model. The effectiveness of the proposed model is illustrated through simulation studies and the analysis of a geochemical data set, predicting rare earth element concentrations within an oil and gas reserve in Wyoming. Simulation results indicate that our proposed model offers a significant reduction in the mean square errors and variation of the regression coefficients. 
Furthermore, the method provided a 50% reduction in prediction variance for rare earth element concentrations within our data analysis.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141132058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-05-18 | DOI: 10.1016/j.spasta.2024.100842
Osman Doğan
{"title":"Integrated deviance information criterion for spatial autoregressive models with heteroskedasticity","authors":"Osman Doğan","doi":"10.1016/j.spasta.2024.100842","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100842","url":null,"abstract":"<div><p>In this study, we introduce the integrated deviance information criterion (DIC) for nested and non-nested model selection problems in heteroskedastic spatial autoregressive models. In a Bayesian estimation setting, we assume that the idiosyncratic error terms of our spatial autoregressive model have a scale mixture of normal distributions, where the scale mixture variables are latent variables that induce heteroskedasticity. We first derive the integrated likelihood function by analytically integrating out the scale mixture variables from the complete-data likelihood function. We then use the integrated likelihood function to formulate the integrated DIC measure. We investigate the finite sample performance of the integrated DIC in selecting the true model in a simulation study. The simulation results show that the integrated DIC performs satisfactorily and can be useful for selecting the correct model in specification search exercises. Finally, in a spatially augmented economic growth model, we use the integrated DIC to choose the spatial weights matrix that leads to better predictive accuracy.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141095822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-05-04 | DOI: 10.1016/j.spasta.2024.100839
Stella Self, Xingpei Zhao, Anja Zgodic, Anna Overby, David White, Alexander C. McLain, Caitlin Dyckman
{"title":"A hypothesis test for detecting spatial patterns in categorical areal data","authors":"Stella Self , Xingpei Zhao , Anja Zgodic , Anna Overby , David White , Alexander C. McLain , Caitlin Dyckman","doi":"10.1016/j.spasta.2024.100839","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100839","url":null,"abstract":"<div><p>The vast growth of spatial datasets in recent decades has fueled the development of many statistical methods for detecting spatial patterns. Two of the most commonly studied spatial patterns are clustering, loosely defined as datapoints with similar attributes existing close together, and dispersion, loosely defined as the semi-regular placement of datapoints with similar attributes. In this work, we develop a hypothesis test to detect spatial clustering or dispersion at specific distances in categorical areal data. Such data consists of a set of spatial regions whose boundaries are fixed and known (e.g., counties) associated with a categorical random variable (e.g., whether the county is rural, micropolitan, or metropolitan). We propose a method to extend the positive area proportion function (developed for detecting spatial clustering in binary areal data) to the categorical case. This proposal, referred to as the categorical positive area proportion function test, can detect various spatial patterns, including homogeneous clusters, heterogeneous clusters, and dispersion. Our approach is the first method capable of distinguishing between different types of clustering in categorical areal data. 
After validating our method using an extensive simulation study, we use the categorical positive area proportion function test to detect spatial patterns in biological, agricultural, built and open conservation easements in Boulder County, Colorado, USA.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140906483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-05-03 | DOI: 10.1016/j.spasta.2024.100840
Mehdi Moradi, Ali Sharifi
{"title":"Summary statistics for spatio-temporal point processes on linear networks","authors":"Mehdi Moradi, Ali Sharifi","doi":"10.1016/j.spasta.2024.100840","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100840","url":null,"abstract":"<div><p>We propose novel second/higher-order summary statistics for inhomogeneous spatio-temporal point processes when the spatial locations are limited to a linear network. More specifically, letting the spatial distance between events be measured by a regular distance metric, appropriate forms of <span><math><mi>K</mi></math></span>- and <span><math><mi>J</mi></math></span>-functions are introduced, and their theoretical relationships are studied. The theoretical forms of our proposed summary statistics are investigated under homogeneity, Poissonness, and independent thinning. Moreover, non-parametric estimators are derived, facilitating the use of our proposed summary statistics to study the spatio-temporal dependence between events. Through simulation studies, we demonstrate that our proposed <span><math><mi>J</mi></math></span>-function effectively identifies spatio-temporal clustering, inhibition, and randomness. 
Finally, we examine spatio-temporal dependencies for street crimes in Valencia, Spain, and traffic accidents in New York, USA.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2211675324000319/pdfft?md5=345d0dcd771c5b3d1a2f681cc0522723&pid=1-s2.0-S2211675324000319-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140893753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-04-27 | DOI: 10.1016/j.spasta.2024.100834
Yunquan Song, Minglu Fang, Yuanfeng Wang, Yiming Hou
{"title":"Rapid outlier detection, model selection and variable selection using penalized likelihood estimation for general spatial models","authors":"Yunquan Song, Minglu Fang, Yuanfeng Wang, Yiming Hou","doi":"10.1016/j.spasta.2024.100834","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100834","url":null,"abstract":"<div><p>Outliers in a data set can influence statistical inference and can also carry useful information about the data, so methods for outlier detection and accommodation are an important topic in data analysis. For spatial data, outliers affect not only coefficient estimation but also model selection. Traditional methods usually carry out outlier detection, model selection and variable selection step by step, which makes data processing inefficient. To improve the efficiency and accuracy of data processing, we consider a technique, based on the general spatial model, that achieves outlier detection together with model and variable selection in one step. In the general spatial model, we add a mean shift parameter for each data point to identify outliers. Penalized likelihood estimation (PLE) is proposed to simultaneously detect outliers and select spatial models and explanatory variables for spatial data. In numerical simulation and case analysis, this method correctly identifies multiple outliers, provides a proper spatial model, and corrects coefficient estimation without removing outliers. Compared to current methods, PLE detects outliers more quickly, and it solves a single optimization problem to select spatial models and explanatory variables. 
Calculation is easy using the optimized solnp function in R software.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140815588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-04-27 | DOI: 10.1016/j.spasta.2024.100833
Jie Li, Yunquan Song
{"title":"Incremental transfer learning for spatial autoregressive model with linear constraints","authors":"Jie Li, Yunquan Song","doi":"10.1016/j.spasta.2024.100833","DOIUrl":"https://doi.org/10.1016/j.spasta.2024.100833","url":null,"abstract":"<div><p>Transfer learning is generally regarded as a beneficial technique for utilizing external information to enhance learning performance on target tasks. However, current research on transfer learning in high-dimensional regression models does not take into account both the location information of the data and the explicit utilization of prior knowledge. In the framework of transfer learning, this study seeks to resolve the spatial autoregressive problem and investigate the impact of introducing linear constraints. In this paper, a two-step transfer learning approach and a transferable source detection algorithm based on cross-validation are proposed when the input dimensions of the source and target datasets are the same. When the input dimensions are different, this paper suggests a straightforward and workable incremental transfer learning method. Additionally, for the estimating model developed under this method, Karush–Kuhn–Tucker (KKT) conditions and degrees of freedom are determined, and a Bayesian Information Criterion (BIC) is created for choosing hyperparameters. 
The effectiveness of the proposed methods is proven by numerical calculations, and the performance of the model in transfer learning estimation is improved by the addition of linear constraints.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140824975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-04-25 | DOI: 10.1016/j.spasta.2024.100826
Lily Wang, Guannan Wang, Annie S. Gao
{"title":"Exploring heterogeneity and dynamics of meteorological influences on US PM2.5: A distributed learning approach with spatiotemporal varying coefficient models","authors":"Lily Wang , Guannan Wang , Annie S. Gao","doi":"10.1016/j.spasta.2024.100826","DOIUrl":"10.1016/j.spasta.2024.100826","url":null,"abstract":"<div><p>Particulate matter (PM) has emerged as a primary air quality concern due to its substantial impact on human health. Many recent research works suggest that PM<sub>2.5</sub> concentrations depend on meteorological conditions. Enhancing current pollution control strategies necessitates a more holistic comprehension of PM<sub>2.5</sub> dynamics and the precise quantification of spatiotemporal heterogeneity in the relationship between meteorological factors and PM<sub>2.5</sub> levels. The spatiotemporal varying coefficient model stands as a prominent spatial regression technique adept at addressing this heterogeneity. Amidst the challenges posed by the substantial scale of modern spatiotemporal datasets, we propose a pioneering distributed estimation method (DEM) founded on multivariate spline smoothing across a domain’s triangulation. This DEM algorithm ensures an easily implementable, highly scalable, and communication-efficient strategy, demonstrating almost linear speedup potential. We validate the effectiveness of our proposed DEM through extensive simulation studies, demonstrating that it achieves coefficient estimations akin to those of global estimators derived from complete datasets. 
Applying the proposed model and method to the US daily PM<sub>2.5</sub> and meteorological data, we investigate the influence of meteorological variables on PM<sub>2.5</sub> concentrations, revealing both spatial and seasonal variations in this relationship.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140766668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial Statistics | Pub Date: 2024-04-18 | DOI: 10.1016/j.spasta.2024.100824
Blerta Begu, Simone Panzeri, Eleonora Arnone, Michelle Carey, Laura M. Sangalli
{"title":"A nonparametric penalized likelihood approach to density estimation of space–time point patterns","authors":"Blerta Begu , Simone Panzeri , Eleonora Arnone , Michelle Carey , Laura M. Sangalli","doi":"10.1016/j.spasta.2024.100824","DOIUrl":"10.1016/j.spasta.2024.100824","url":null,"abstract":"<div><p>In this work, we consider space–time point processes and study their continuous space–time evolution. We propose an innovative nonparametric methodology to estimate the unknown space–time density of the point pattern, or, equivalently, to estimate the intensity of an inhomogeneous space–time Poisson point process. The presented approach combines maximum likelihood estimation with roughness penalties, based on differential operators, defined over the spatial and temporal domains of interest. We first establish some important theoretical properties of the considered estimator, including its consistency. We then develop an efficient and flexible estimation procedure that leverages advanced numerical and computation techniques. Thanks to a discretization based on finite elements in space and B-splines in time, the proposed method can effectively capture complex multi-modal and strongly anisotropic spatio-temporal point patterns; moreover, these point patterns may be observed over planar or curved domains with non-trivial geometries, due to geographic constraints, such as coastal regions with complicated shorelines, or curved regions with complex orography. In addition to providing estimates, the method’s functionalities also include the introduction of appropriate uncertainty quantification tools. We thoroughly validate the proposed method, by means of simulation studies and applications to real-world data. 
The obtained results highlight significant advantages over state-of-the-art competing approaches.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2211675324000150/pdfft?md5=fcab55472ed3f4b5aa4f0e9b44fe624a&pid=1-s2.0-S2211675324000150-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140757441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}