Statistical Modelling: latest articles

A statistical modelling approach to feedforward neural network model selection

IF 1 · Mathematics, Zone 4

Statistical Modelling Pub Date : 2024-09-17 DOI: 10.1177/1471082x241258261

Andrew McInerney, Kevin Burke

Abstract: Feedforward neural networks (FNNs) can be viewed as non-linear regression models, where covariates enter the model through a combination of weighted summations and non-linear functions. Although these models have some similarities to the approaches used within statistical modelling, the majority of neural network research has been conducted outside of the field of statistics. This has resulted in a lack of statistically based methodology, and, in particular, there has been little emphasis on model parsimony. Determining the input layer structure is analogous to variable selection, while the structure for the hidden layer relates to model complexity. In practice, neural network model selection is often carried out by comparing models using out-of-sample performance. However, in contrast, the construction of an associated likelihood function opens the door to information-criteria-based variable and architecture selection. A novel model selection method, which performs both input- and hidden-node selection, is proposed using the Bayesian information criterion (BIC) for FNNs. The choice of BIC over out-of-sample performance as the model selection objective function leads to an increased probability of recovering the true model, while parsimoniously achieving favourable out-of-sample performance. Simulation studies are used to evaluate and justify the proposed method, and applications on real data are investigated.

Citations: 0

The Skellam distribution revisited: Estimating the unobserved incoming and outgoing ICU COVID-19 patients on a regional level in Germany

IF 1 · Mathematics, Zone 4

Statistical Modelling Pub Date : 2024-05-27 DOI: 10.1177/1471082x241235024

Martje Rave, Göran Kauermann

Abstract: With the beginning of the COVID-19 pandemic, we became aware of the need for comprehensive data collection and its provision to scientists and experts for proper data analyses. In Germany, the Robert Koch Institute (RKI) has tried to keep up with this demand for data on COVID-19, but there were (and still are) relevant data missing that are needed to understand the whole picture of the pandemic. In this article, we take a closer look at the severity of the course of COVID-19 in Germany, for which ideal information would be the number of incoming patients to ICU units. This information was (and still is) not available. Instead, the current occupancy of ICU units on the district level was reported daily. We demonstrate how this information can be used to predict the number of incoming as well as released COVID-19 patients using a stochastic version of the Expectation Maximization algorithm (SEM). This, in turn, allows for estimating the influence of district-specific and age-specific infection rates as well as further covariates, including spatial effects, on the number of incoming patients. The article demonstrates that even if relevant data are not recorded or provided officially, statistical modelling allows for reconstructing them. This also includes the quantification of uncertainty which naturally results from the application of the SEM algorithm.

Citations: 0

A novel mixture model for characterizing human aiming performance data

IF 1 · Mathematics, Zone 4

Statistical Modelling Pub Date : 2024-04-25 DOI: 10.1177/1471082x241234139

Yanxi Li, Derek S. Young, Julien Gori, Olivier Rioul

Abstract: Fitts’ law is often employed as a predictive model for human movement, especially in the field of human-computer interaction. Models with an assumed Gaussian error structure are usually adequate when applied to data collected from controlled studies. However, observational data (often referred to as data gathered ‘in the wild’) typically display noticeable positive skewness relative to a mean trend as users do not routinely try to minimize their task completion time. As such, the exponentially modified Gaussian (EMG) regression model has been applied to aimed movements data. However, it is also of interest to reasonably characterize those regions where a user likely was not trying to minimize their task completion time. In this article, we propose a novel model with a two-component mixture structure (one Gaussian and one exponential) on the errors to identify such a region. An expectation-conditional-maximization (ECM) algorithm is developed for estimation of such a model and some properties of the algorithm are established. The efficacy of the proposed model, as well as its ability to inform model-based clustering, are addressed in this work through extensive simulations and an insightful analysis of a human aiming performance study.

Citations: 0

Fast, effective, and coherent time series modelling using the sparsity-ranked lasso

IF 1 · Mathematics, Zone 4

Statistical Modelling Pub Date : 2024-03-08 DOI: 10.1177/1471082x231225307

Ryan Peterson, Joseph Cavanaugh

Abstract: The sparsity-ranked lasso (SRL) has been developed for model selection and estimation in the presence of interactions and polynomials. The main tenet of the SRL is that an algorithm should be more sceptical of higher-order polynomials and interactions a priori compared to main effects, and hence the inclusion of these more complex terms should require a higher level of evidence. In time series, the same idea of ranked prior scepticism can be applied to characterize the potentially complex seasonal autoregressive (AR) structure of a series during the model fitting process, becoming especially useful in settings with uncertain or multiple modes of seasonality. The SRL can naturally incorporate exogenous variables, with streamlined options for inference and/or feature selection. The fitting process is quick even for large series with a high-dimensional feature set. In this work, we discuss both the formulation of this procedure and the software we have developed for its implementation via the fastTS R package. We explore the performance of our SRL-based approach in a novel application involving the autoregressive modelling of hourly emergency room arrivals at the University of Iowa Hospitals and Clinics. We find that the SRL is considerably faster than its competitors, while generally producing more accurate predictions.

Citations: 0

Taking advantage of sampling designs in spatial small-area survey studies

IF 1 · Mathematics, Zone 4

Statistical Modelling Pub Date : 2024-03-05 DOI: 10.1177/1471082x231226287

Carlos Vergara-Hernández, Marc Marí-Dell’Olmo, Laura Oliveras, Miguel Angel Martinez-Beneito

Citations: 0

Copula-based pairwise estimator for quantile regression with hierarchical missing data

IF 1 · Mathematics, Zone 4

Statistical Modelling Pub Date : 2024-02-28 DOI: 10.1177/1471082x231225806

Anneleen Verhasselt, Alvaro J. Flórez, Geert Molenberghs, Ingrid Van Keilegom

Citations: 0

Impact of jittering on raster- and distance-based geostatistical analyses of DHS data