{"title":"Covariate-modulated rectangular latent Markov models with an unknown number of regime profiles","authors":"Alfonso Russo, Alessio Farcomeni, Maria Grazia Pittau, Roberto Zelli","doi":"10.1177/1471082x221127732","DOIUrl":"https://doi.org/10.1177/1471082x221127732","url":null,"abstract":"We derive a multivariate latent Markov model with number of latent states that can possibly change at each time point. We model both the manifest and latent distributions conditionally on explanato...","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138539639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-parametric hidden Markov model for large-scale multiple testing under dependency","authors":"Joungyoun Kim, Johan Lim, Jong Soo Lee","doi":"10.1177/1471082x221121235","DOIUrl":"https://doi.org/10.1177/1471082x221121235","url":null,"abstract":"In this article, we propose a new semiparametric hidden Markov model (HMM) for use in simultaneous hypothesis testing under dependency. The semi- or non-parametric HMMs in the literature require two conditions for model identifiability: (a) the latent Markov chain (MC) is ergodic and its transition probability is full rank, and (b) the observational distributions of different hidden states are disjoint or linearly independent. Unlike the existing models, our semiparametric HMM with two hidden states makes no assumption on the transition probability of the latent MC but assumes that observational distributions are extremal for the set of all stationary distributions of the model. To estimate the model, we propose a modified expectation-maximization algorithm, whose M-step has an additional purification step that makes the observational distributions extremal. We numerically investigate the performance of the proposed procedure in the estimation of the model and compare it to two recent existing methods in various multiple testing error settings. In addition, we apply our procedure to two real data examples: the gas chromatography/mass spectrometry experiment to differentiate the origin of herbal medicine and the epidemiologic surveillance of an influenza-like illness.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45709451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recurrent events analysis with piece-wise exponential additive mixed models","authors":"Jordache Ramjith, Andreas Bender, Kit C. B. Roes, Marianne A. Jonker","doi":"10.1177/1471082x221117612","DOIUrl":"https://doi.org/10.1177/1471082x221117612","url":null,"abstract":"Recurrent events analysis plays an important role in many applications, including the study of chronic diseases or recurrence of infections. Historically, many models for recurrent events have been...","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138539638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Response mixture models based on supervised components: Clustering floristic taxa","authors":"Julien Gibaud, X. Bry, C. Trottier, F. Mortier, M. Réjou‐Méchain","doi":"10.1177/1471082x221115525","DOIUrl":"https://doi.org/10.1177/1471082x221115525","url":null,"abstract":"In this article, we propose to cluster responses in order to identify groups predicted by specific explanatory components. A response matrix is assumed to depend on a set of explanatory variables and a set of additional covariates. Explanatory variables are assumed to be numerous and redundant, which implies some dimension reduction and regularization. By contrast, additional covariates contain few selected variables which are forced into the regression model, as they demand no regularization. The response matrix is assumed to be partitioned into several unknown groups of responses. We suppose that the responses in each group are predictable from an appropriate number of specific orthogonal supervised components of explanatory variables. The classification is based on a mixture model of the responses. To estimate the model, we propose a criterion extending that of Supervised Component-based Generalized Linear Regression, a Partial Least Squares-type method, and develop an algorithm combining a component-based model with Expectation Maximization estimation. This new methodology is tested on simulated data and then applied to a floristic ecology dataset.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41967369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A model for space-time threshold exceedances with an application to extreme rainfall","authors":"P. Bortot, C. Gaetan","doi":"10.1177/1471082x221098224","DOIUrl":"https://doi.org/10.1177/1471082x221098224","url":null,"abstract":"In extreme value studies, models for observations exceeding a fixed high threshold have the advantage of exploiting the available extremal information while avoiding bias from low values. In the context of space-time data, the challenge is to develop models for threshold exceedances that account for both spatial and temporal dependence. We address this issue through a modelling approach that embeds spatial dependence within a time series formulation. The model allows for different forms of limiting dependence in the spatial and temporal domains as the threshold level increases. In particular, temporal asymptotic independence is assumed, as this is often supported by empirical evidence, especially in environmental applications, while both asymptotic dependence and asymptotic independence are considered for the spatial domain. Inference from the observed exceedances is carried out through a combination of pairwise likelihood and a censoring mechanism. For those model specifications for which direct maximization of the censored pairwise likelihood is unfeasible, we propose an indirect inference procedure which leads to satisfactory results in a simulation study. The approach is applied to a dataset of rainfall amounts recorded over a set of weather stations in the North Brabant province of the Netherlands. The application shows that the range of extremal patterns that the model can cover is wide and that it has a competitive performance with respect to an alternative existing model for space-time threshold exceedances.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46124461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-exciting point process modelling of crimes on linear networks","authors":"Nicoletta D’Angelo, David Payares, G. Adelfio, J. Mateu","doi":"10.1177/1471082x221094146","DOIUrl":"https://doi.org/10.1177/1471082x221094146","url":null,"abstract":"Although there are recent developments for the analysis of first and second-order characteristics of point processes on networks, there have been very few attempts at introducing models for network data. Motivated by the analysis of crime data in Bucaramanga (Colombia), we propose a spatiotemporal Hawkes point process model adapted to events living on linear networks. We first consider a non-parametric modelling strategy, in which both the background and the triggering components are estimated non-parametrically. Then we consider a semi-parametric version, including a parametric estimation of the background based on covariates, and a non-parametric one of the triggering effects. Our model can be easily adapted to multi-type processes. Our network model outperforms a planar version, improving the fitting of the self-exciting point process model.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46969696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian discrete conditional transformation models","authors":"Manuel Carlan, T. Kneib","doi":"10.1177/1471082x221114177","DOIUrl":"https://doi.org/10.1177/1471082x221114177","url":null,"abstract":"We propose a novel Bayesian model framework for discrete ordinal and count data based on conditional transformations of the responses. The conditional transformation function is estimated from the data in conjunction with an a priori chosen reference distribution. For count responses, the resulting transformation model is novel in the sense that it is a Bayesian fully parametric yet distribution-free approach that can additionally account for excess zeros with additive transformation function specifications. For ordinal categoric responses, our cumulative link transformation model allows the inclusion of linear and non-linear covariate effects that can additionally be made category-specific, resulting in (non-)proportional odds or hazards models and more, depending on the choice of the reference distribution. Inference is conducted by a generic modular Markov chain Monte Carlo algorithm where multivariate Gaussian priors enforce specific properties such as smoothness on the functional effects. To illustrate the versatility of Bayesian discrete conditional transformation models, applications to counts of patent citations in the presence of excess zeros and on treating forest health categories in a discrete partial proportional odds model are presented.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46689765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpretable modelling of retail demand and price elasticity for passenger flights using booking data","authors":"Jan Felix Meyer, Goeran Kauermann, M. Smith","doi":"10.1177/1471082x221083343","DOIUrl":"https://doi.org/10.1177/1471082x221083343","url":null,"abstract":"We propose a model of retail demand for air travel and ticket price elasticity at the daily booking and individual flight level. Daily bookings are modelled as a non-homogeneous Poisson process with respect to the time to departure. The booking intensity is a function of booking and flight level covariates, including non-linear effects modelled semi-parametrically using penalized splines. Customer heterogeneity is incorporated using a finite mixture model, where the latent segments have covariate-dependent probabilities. We fit the model to a unique dataset of over one million daily counts of bookings for 9 602 scheduled flights on a short-haul route over two years. A control variate approach with a strong instrument corrects for a substantial level of price endogeneity. A rich latent segmentation is uncovered, along with strong covariate effects. The calibrated model can be used to quantify demand and price elasticity for different flights booked on different days prior to departure and is a step towards continuous pricing; something that is a major objective of airlines. As our model is interpretable, forecasts can be created under different scenarios. For instance, while our model is calibrated on data collected prior to COVID-19, many of the empirical insights are likely to remain valid as air travel recovers in the post-COVID-19 period.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43450403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Lasso and adaptive Lasso for non-random sample in credit scoring","authors":"E. Ogundimu","doi":"10.1177/1471082x221092181","DOIUrl":"https://doi.org/10.1177/1471082x221092181","url":null,"abstract":"Prediction models in credit scoring are often formulated using available data on accepted applicants at the loan application stage. The use of this data to estimate probability of default (PD) may lead to bias due to non-random selection from the population of applicants. That is, the PD in the general population of applicants may not be the same as the PD in the subpopulation of the accepted applicants. A prominent model for the reduction of bias in this framework is the sample selection model, but there is no consensus on its utility yet. It is unclear if the bias-variance trade-off of regularization techniques can improve the predictions of PD in a non-random sample selection setting. To address this, we propose the use of Lasso and adaptive Lasso for variable selection and optimal predictive accuracy. By appealing to the least squares approximation of the likelihood function of the sample selection model, we optimize the resulting function subject to L1 and adaptively weighted L1 penalties using an efficient algorithm. We evaluate the performance of the proposed approach and competing alternatives in a simulation study and apply it to the well-known American Express credit card dataset.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46767855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}