Philipp Otto, Alessandro Fusta Moro, Jacopo Rodeschini, Qendrim Shaboviq, Rosaria Ignaccolo, Natalia Golini, Michela Cameletti, Paolo Maranzano, Francesco Finazzi, Alessandro Fassò
{"title":"Spatiotemporal modelling of $$\\hbox {PM}_{2.5}$$ concentrations in Lombardy (Italy): a comparative study","authors":"Philipp Otto, Alessandro Fusta Moro, Jacopo Rodeschini, Qendrim Shaboviq, Rosaria Ignaccolo, Natalia Golini, Michela Cameletti, Paolo Maranzano, Francesco Finazzi, Alessandro Fassò","doi":"10.1007/s10651-023-00589-0","DOIUrl":"https://doi.org/10.1007/s10651-023-00589-0","url":null,"abstract":"<p>This study presents a comparative analysis of three predictive models with an increasing degree of flexibility: hidden dynamic geostatistical models (HDGM), generalised additive mixed models (GAMM), and the random forest spatiotemporal kriging models (RFSTK). These models are evaluated for their effectiveness in predicting <span>\\(\\text {PM}_{2.5}\\)</span> concentrations in Lombardy (North Italy) from 2016 to 2020. Despite differing methodologies, all models demonstrate proficient capture of spatiotemporal patterns within air pollution data with similar out-of-sample performance. Furthermore, the study delves into station-specific analyses, revealing variable model performance contingent on localised conditions. Model interpretation, facilitated by parametric coefficient analysis and partial dependence plots, unveils consistent associations between predictor variables and <span>\\(\\text {PM}_{2.5}\\)</span> concentrations. Despite nuanced variations in modelling spatiotemporal correlations, all models effectively accounted for the underlying dependence. 
In summary, this study underscores the efficacy of conventional techniques in modelling correlated spatiotemporal data, concurrently highlighting the complementary potential of Machine Learning and classical statistical approaches.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139667425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrete Beta and Shifted Beta-Binomial models for rating and ranking data","authors":"Mariangela Sciandra, Salvatore Fasola, Alessandro Albano, Chiara Di Maria, Antonella Plaia","doi":"10.1007/s10651-023-00592-5","DOIUrl":"https://doi.org/10.1007/s10651-023-00592-5","url":null,"abstract":"<p>Ranking and rating methods for preference data result in a different underlying organization of data that can lead to manifold probabilistic approaches to data modelling. As an alternative to existing approaches, two new flexible probability distributions are discussed as a modelling framework: the <i>Discrete Beta</i> and the <i>Shifted Beta-Binomial</i>. Through the presentation of three real-world examples, we demonstrate the practical utility of these distributions. These illustrative cases show how these novel distributions can effectively address real-world challenges, with a particular focus on data derived from surveys concerning environmental issues. Our analysis highlights the new distributions’ capability to capture the inherent structures within preference data, offering valuable insights into the field.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"19 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139585893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advances in Kth nearest-neighbour clutter removal","authors":"Nicoletta D’Angelo","doi":"10.1007/s10651-023-00588-1","DOIUrl":"https://doi.org/10.1007/s10651-023-00588-1","url":null,"abstract":"<p>We consider the problem of feature detection in the presence of clutter in spatial point processes. Classification methods have been developed in previous studies. Among these, Byers and Raftery (J Am Stat Assoc 93(442):577–584, 1998) model the observed <i>K</i>th nearest neighbour distances as a mixture distribution and classify the <i>clutter</i> and <i>feature</i> points accordingly. In this paper, we enhance this approach in two ways. First, we propose an automatic procedure for selecting the number of nearest neighbours to consider in the classification method by means of segmented regression models. Secondly, with the aim of applying the procedure multiple times to get a “better” end result, we propose a stopping criterion that minimizes the overall entropy measure of cluster separation between clutter and feature points. The proposed procedures are suitable for a feature with clutter as two superimposed Poisson processes on any space, including linear networks. We present simulations and two case studies of environmental data to illustrate the method.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"45 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139557249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logistic regression versus XGBoost for detecting burned areas using satellite images","authors":"","doi":"10.1007/s10651-023-00590-7","DOIUrl":"https://doi.org/10.1007/s10651-023-00590-7","url":null,"abstract":"<p>Classical statistical methods prove advantageous for small datasets, whereas machine learning algorithms can excel with larger datasets. Our paper challenges this conventional wisdom by addressing a highly significant problem: the identification of burned areas through satellite imagery, which is a clear example of imbalanced data. The methods are illustrated for North-Central Portugal and North-West Spain in October 2017 within a multi-temporal setting of satellite imagery. Daily satellite images are taken from Moderate Resolution Imaging Spectroradiometer (MODIS) products. Our analysis shows that a classical logistic regression (LR) model performs on par with, if not better than, a widely employed machine learning algorithm, extreme gradient boosting (XGBoost), within this particular domain.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"19 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139509837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flash floods in Mediterranean catchments: a meta-model decision support system based on Bayesian networks","authors":"Rosa F. Ropero, M. Julia Flores, Rafael Rumí","doi":"10.1007/s10651-023-00587-2","DOIUrl":"https://doi.org/10.1007/s10651-023-00587-2","url":null,"abstract":"<p>Natural disasters, especially those related to water—like storms and floods—have increased over the last decades both in number and intensity. Under the current Climate Change framework, several reports predict an increase in the intensity and duration of these extreme climatic events, where the Mediterranean area would be one of the most affected. This paper develops a decision support system based on Bayesian inference able to predict a flood alert in Andalusian Mediterranean catchments. The key point is that, using simple weather forecasts and live measurements of river level, we can get a flood-alert several hours before it happens. A set of models based on Bayesian networks was learnt for each of the catchments included in the study area, and joined together into a more complex model based on a rule system. This final meta-model was validated using data from both non-extreme and extreme storm events. Results show that the methodology proposed provides an accurate forecast of the flood situation of the greatest catchment areas of Andalusia.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"82 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139422290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A zero-inflated model for spatiotemporal count data with extra zeros: application to 1950–2015 tornado data in Kansas","authors":"Hong-Ding Yang, Audrey Chang, Wei-Wen Hsu, Chun-Shu Chen","doi":"10.1007/s10651-023-00586-3","DOIUrl":"https://doi.org/10.1007/s10651-023-00586-3","url":null,"abstract":"<p>In many tornado climate studies, the number of tornado touchdowns is often the primary outcome of interest. These outcome measures are usually generated under a spatiotemporal correlation structure and contain many zeros due to the rarity of tornado occurrence at a specific location and time interval. To model the spatiotemporal count data with excess zeros, we propose a spatiotemporal zero-inflated Poisson (ZIP) model, which lends itself to ease of interpretation and computational simplicity. Technically, we embed a modified conditional autoregressive model in the ZIP model to describe the spatial and temporal correlations, where the probability of a pure zero in the ZIP is purposely designed to depend on location but be independent of time. Illustrated with the longitudinal tornado touchdown data in the state of Kansas from 1950 to 2015, our model suggests that the spatial correlation among the counties and the corresponding temperature are significant factors attributed to the tornado touchdowns. Through the model, we can also estimate the probabilities of no tornado touchdowns for each county over time. These estimated probabilities substantially help us understand the pattern of touchdowns and further identify the risk areas across Kansas. Moreover, these estimates can be iteratively updated when more current touchdown data are available. 
The final model for Kansas tornado touchdown data is evaluated using more recent data.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"78 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2023-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139030333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are the income and price elasticities of economy-wide electricity demand in middle-income countries time-varying? Evidence from panels and individual countries","authors":"Brantley Liddle, Fakhri Hasanov","doi":"10.1007/s10651-023-00585-4","DOIUrl":"https://doi.org/10.1007/s10651-023-00585-4","url":null,"abstract":"","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"25 1","pages":"827 - 849"},"PeriodicalIF":3.8,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138533560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some statistical problems involved in forecasting and estimating the spread of SARS-CoV-2 using Hawkes point processes and SEIR models","authors":"Frederic Schoenberg","doi":"10.1007/s10651-023-00591-6","DOIUrl":"https://doi.org/10.1007/s10651-023-00591-6","url":null,"abstract":"<p>This article reviews some of the statistical issues involved with modeling SARS-CoV-2 (Covid-19) in Los Angeles County, California, using Hawkes point process models and SEIR models. The two types of models are compared, and their pros and cons are discussed. We also discuss particular statistical decisions, such as where to place the upper limits on y-axes, whether to use a Bayesian or frequentist version of the model, how to estimate seroprevalence, and how to fit the density of transmission times in the Hawkes model.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"13 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138533608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Omer Ozturk, Blair L. Robertson, Olena Kravchuk, Jennifer Brown
{"title":"Trade-off between efficiency and variance estimation of spatially balanced augmented samples","authors":"Omer Ozturk, Blair L. Robertson, Olena Kravchuk, Jennifer Brown","doi":"10.1007/s10651-023-00582-7","DOIUrl":"https://doi.org/10.1007/s10651-023-00582-7","url":null,"abstract":"<p>In this paper, we construct three types of augmented samples, which are samples generated from two separate randomization events. The first type combines a simple random sample (<i>SRS</i>) with a spatially balanced sample (<i>SBS</i>) selected from the same finite population. The second type combines an <i>SBS</i> with an <i>SRS</i>. The third type combines two spatially balanced samples. The simple random sample is constructed without replacement and does not contain any ties. The spatially balanced samples are constructed using the properties of the Halton sequence. We provide the first and second order inclusion probabilities for the augmented samples. Next, using the inclusion probabilities of the augmented samples, we construct estimators for the mean and total of a finite population. The efficiency of the augmented samples varies between the efficiency of <i>SRS</i> and <i>SBS</i> samples. If the number of <i>SRS</i> observations in the augmented sample is large, the efficiency is closer to the efficiency of <i>SRS</i>. Otherwise, it is closer to the efficiency of <i>SBS</i>. We also provide estimators for the variances of the estimators of population total of augmented samples. The stability of these variance estimators depends on the proportion of <i>SRS</i> observations in the augmented samples. 
A larger number of <i>SRS</i> observations leads to stable variance estimators.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"34 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138533575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding the number of latent states in hidden Markov models using information criteria","authors":"Jodie Buckby, Ting Wang, David Fletcher, Jiancang Zhuang, Akiko Takeo, Kazushige Obara","doi":"10.1007/s10651-023-00584-5","DOIUrl":"https://doi.org/10.1007/s10651-023-00584-5","url":null,"abstract":"<p>Hidden Markov models (HMMs) are often used to model time series data and are applied in many fields of research. However, estimating the unknown number of hidden states in the Markov chain is a non-trivial component of HMM model selection and an area of active research. Currently, AIC and BIC are commonly used for this purpose, despite theoretical issues and some evidence of poor performance in the literature. Here, motivated by the HMMs developed to model seismic tremor data, we use simulation studies to compare the performance of a number of model selection information criteria when used to select the number of hidden states in HMMs, including an adjusted BIC not previously used with HMMs. We find that AIC and BIC are not always reliable tools for selecting the number of hidden states in HMMs and that other information criteria such as adjusted BIC can actually perform better, depending on factors such as sample size and sojourn times in each state. 
We apply the information criteria to a set of HMMs fitted to seismic tremor data and compare the models selected by the different criteria.</p>","PeriodicalId":50519,"journal":{"name":"Environmental and Ecological Statistics","volume":"101 1 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2023-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138533632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}