Stats, Pub Date: 2023-10-09, DOI: 10.3390/stats6040065
Yuelei Sui, Scott H. Holan, Wen-Hsi Yang
{"title":"Computationally Efficient Poisson Time-Varying Autoregressive Models through Bayesian Lattice Filters","authors":"Yuelei Sui, Scott H. Holan, Wen-Hsi Yang","doi":"10.3390/stats6040065","DOIUrl":"https://doi.org/10.3390/stats6040065","url":null,"abstract":"Estimation of time-varying autoregressive models for count-valued time series can be computationally challenging. In this direction, we propose a time-varying Poisson autoregressive (TV-Pois-AR) model that accounts for the changing intensity of the Poisson process. Our approach can capture the latent dynamics of the time series and therefore make superior forecasts. To speed up the estimation of the TV-AR process, our approach uses the Bayesian Lattice Filter. In addition, the No-U-Turn Sampler (NUTS) is used, instead of a random walk Metropolis–Hastings algorithm, to sample intensity-related parameters without a closed-form full conditional distribution. The effectiveness of our approach is evaluated through model-based and empirical simulation studies. Finally, we demonstrate the utility of the proposed model through an example of COVID-19 spread in New York State and an example of US COVID-19 hospitalization data.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135094311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-10-08, DOI: 10.3390/stats6040064
Letícia Ellen Dal Canton, Luciana Pagliosa Carvalho Guedes, Miguel Angel Uribe-Opazo, Tamara Cantu Maltauro
{"title":"Effective Sample Size with the Bivariate Gaussian Common Component Model","authors":"Letícia Ellen Dal Canton, Luciana Pagliosa Carvalho Guedes, Miguel Angel Uribe-Opazo, Tamara Cantu Maltauro","doi":"10.3390/stats6040064","DOIUrl":"https://doi.org/10.3390/stats6040064","url":null,"abstract":"Effective sample size (ESS) consists of an equivalent number of sampling units of a georeferenced variable that would produce the same sampling error, as it considers the information that each georeferenced sampling unit contains about itself as well as in relation to its neighboring sampling units. This measure can provide useful information in the planning of future georeferenced sampling for spatial variability experiments. The objective of this article was to develop a bivariate methodology for ESS (ESSbi), considering the bivariate Gaussian common component model (BGCCM), which accounts both for the spatial correlation between the two variables and for the individual spatial association. All properties affecting the univariate methodology were verified for ESSbi using simulation studies or algebraic methods, including scenarios to verify the impact of the BGCCM common range parameter on the estimated ESSbi values. ESSbi was applied to real organic matter (OM) and sum of bases (SB) data from an agricultural area. The study found that 60% of the sample observations of the OM–SB pair contained spatially redundant information. The reduced sample configuration proved efficient by preserving spatial variability when comparing the original and reduced OM maps, using SB as a covariate. The Tau concordance index confirmed moderate accuracy between the maps.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135251306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-10-06, DOI: 10.3390/stats6040063
Tshilidzi Mulaudzi, Yehenew Kifle, Roel Braekers
{"title":"A Shared Frailty Model for Left-Truncated and Right-Censored Under-Five Child Mortality Data in South Africa","authors":"Tshilidzi Mulaudzi, Yehenew Kifle, Roel Braekers","doi":"10.3390/stats6040063","DOIUrl":"https://doi.org/10.3390/stats6040063","url":null,"abstract":"Many African nations continue to grapple with persistently high under-five child mortality rates, particularly those situated in the Sub-Saharan region, including South Africa. A multitude of socio-economic factors are identified as key contributors to the elevated under-five child mortality in numerous African nations. This research endeavors to investigate various factors believed to be associated with child mortality by employing advanced statistical models. This study utilizes child-level survival data from South Africa, characterized by left truncation and right censoring, to fit a Cox proportional hazards model under the assumption of working independence. Additionally, a shared frailty model is applied, clustering children based on their mothers. Comparative analysis is performed between the results obtained from the shared frailty model and the Cox proportional hazards model under the assumption of working independence. Within the scope of this analysis, several factors stand out as significant contributors to under-five child mortality in the study area, including gender, birth province, birth year, birth order, and twin status. Notably, the shared frailty model demonstrates superior performance in modeling the dataset, as evidenced by a lower likelihood cross-validation score compared to the Cox proportional hazards model assuming independence. This improvement can be attributed to the shared frailty model’s ability to account for heterogeneity among mothers and the inherent association between siblings born to the same mother, ultimately enhancing the quality of the study’s conclusions.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135351051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-09-29, DOI: 10.3390/stats6040062
Raydonal Ospina, Jaciele Oliveira, Cristiano Ferraz, André Leite, João Gondim
{"title":"Ensemble Algorithms to Improve COVID-19 Growth Curve Estimates","authors":"Raydonal Ospina, Jaciele Oliveira, Cristiano Ferraz, André Leite, João Gondim","doi":"10.3390/stats6040062","DOIUrl":"https://doi.org/10.3390/stats6040062","url":null,"abstract":"In January 2020, the world was taken by surprise as a novel disease, COVID-19, emerged, attributed to the new SARS-CoV-2 virus. Initial cases were reported in China, and the virus rapidly disseminated globally, leading the World Health Organization (WHO) to declare it a pandemic on 11 March 2020. Given the novelty of this pathogen, limited information was available regarding its infection rate and symptoms. Consequently, the necessity of employing mathematical models to enable researchers to describe the progression of the epidemic and make accurate forecasts became evident. This study focuses on the analysis of several dynamic growth models, including the logistic, Gompertz, and Richards growth models, which are commonly employed to depict the spread of infectious diseases. These models are integrated to harness their predictive capabilities, utilizing an ensemble modeling approach. The resulting ensemble algorithm was trained using COVID-19 data from the Brazilian state of Paraíba. The proposed ensemble model approach effectively reduced forecasting errors, showcasing itself as a promising methodology for estimating COVID-19 growth curves, improving data forecasting accuracy, and providing rapid responses in the early stages of the pandemic.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135246022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Confounder Adjustment in Shape-on-Scalar Regression Model: Corpus Callosum Shape Alterations in Alzheimer’s Disease","authors":"Harshita Dogra, Shengxian Ding, Miyeon Yeon, Rongjie Liu, Chao Huang","doi":"10.3390/stats6040061","DOIUrl":"https://doi.org/10.3390/stats6040061","url":null,"abstract":"Large-scale imaging studies often face challenges stemming from heterogeneity arising from differences in geographic location, instrumental setups, image acquisition protocols, study design, and latent variables that remain undisclosed. While numerous regression models have been developed to elucidate the interplay between imaging responses and relevant covariates, limited attention has been devoted to cases where the imaging responses pertain to the domain of shape. This adds complexity to the problem of imaging heterogeneity, primarily due to the unique properties inherent to shape representations, including nonlinearity, high-dimensionality, and the intricacies of quotient space geometry. To tackle this intricate issue, we propose a novel approach: a shape-on-scalar regression model that incorporates confounder adjustment. In particular, we leverage the square root velocity function to extract elastic shape representations which are embedded within the linear Hilbert space of square integrable functions. Subsequently, we introduce a shape regression model aimed at characterizing the intricate relationship between elastic shapes and covariates of interest, all while effectively managing the challenges posed by imaging heterogeneity. We develop comprehensive procedures for estimating and making inferences about the unknown model parameters. Through real-data analysis, our method demonstrates its superiority in terms of estimation accuracy when compared to existing approaches.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135425878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-09-27, DOI: 10.3390/stats6040060
Christos Stefanis, Elpida Giorgi, Giorgios Tselemponis, Chrysa Voidarou, Ioannis Skoufos, Athina Tzora, Christina Tsigalou, Yiannis Kourkoutas, Theodoros C. Constantinidis, Eugenia Bezirtzoglou
{"title":"Terroir in View of Bibliometrics","authors":"Christos Stefanis, Elpida Giorgi, Giorgios Tselemponis, Chrysa Voidarou, Ioannis Skoufos, Athina Tzora, Christina Tsigalou, Yiannis Kourkoutas, Theodoros C. Constantinidis, Eugenia Bezirtzoglou","doi":"10.3390/stats6040060","DOIUrl":"https://doi.org/10.3390/stats6040060","url":null,"abstract":"This study aimed to perform a bibliometric analysis of terroir and explore its conceptual horizons. Advancements in terroir research until 2022 were investigated using the Scopus database, R, and VOSviewer. Out of the 907 results, the most prevalent document types were articles (771) and reviews (70). The annual growth rate of published manuscripts in this field was 7.8%. The research on terroir encompassed a wide range of disciplines, with significant contributions from Agricultural and Biological Sciences, Social Sciences, Environmental Science, Biochemistry, Genetics, and Molecular Biology. Through keyword analysis, the study identified the most frequently occurring terms in titles, abstracts, and keywords fields, including ‘terroir’, ‘wine’, ‘soil’, ‘wines’, ‘grape’, ‘analysis’, ‘vineyard’, ‘composition’, and ‘climate’. A trend topic analysis revealed that research in terroir primarily focused on the geo-ecology and physiology of grapes. Furthermore, considerable attention was given to methods and techniques related to the physicochemical, sensory, and microbial characterization of terroir and various aspects of the wine industry. Initially, the research in this domain was focused on terroir, authenticity, grapevine, soils, soil moisture, and wine quality. However, over time, the research agenda expanded to include topics such as food analysis, viticulture, wine, taste, sustainability, and climate change. New research areas emerged, including phenolic compounds, anthocyanin, phenols, sensory analysis, and precision agriculture—all of which became integral components of the scientific studies on terroir. Overall, this study provided valuable insights into the historical trends and current developments in terroir research, contributing to our understanding of the frontiers in this field.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135536911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-09-18, DOI: 10.3390/stats6030059
Seng Huat Ong, Shin Zhu Sim, Shuangzhe Liu, Hari M. Srivastava
{"title":"A Family of Finite Mixture Distributions for Modelling Dispersion in Count Data","authors":"Seng Huat Ong, Shin Zhu Sim, Shuangzhe Liu, Hari M. Srivastava","doi":"10.3390/stats6030059","DOIUrl":"https://doi.org/10.3390/stats6030059","url":null,"abstract":"This paper considers the construction of a family of discrete distributions with the flexibility to cater for under-, equi- and over-dispersion in count data using a finite mixture model based on standard distributions. We are motivated to introduce this family because its simple finite mixture structure adds flexibility and facilitates application and use in analysis. The family of distributions is exemplified using a mixture of negative binomial and shifted negative binomial distributions. Some basic and probabilistic properties are derived. We perform hypothesis testing for equi-dispersion and simulation studies of their power and consider parameter estimation via maximum likelihood and probability-generating-function-based methods. The utility of the distributions is illustrated via their application to real biological data sets exhibiting under-, equi- and over-dispersion. It is shown that the distribution fits better than the well-known generalized Poisson and COM–Poisson distributions for handling under-, equi- and over-dispersion in count data.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135202745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-09-18, DOI: 10.3390/stats6030058
Jiecheng Song, Guanchao Tong, Wei Zhu
{"title":"A Detecting System for Abrupt Changes in Temporal Incidence Rate of COVID-19 and Other Pandemics","authors":"Jiecheng Song, Guanchao Tong, Wei Zhu","doi":"10.3390/stats6030058","DOIUrl":"https://doi.org/10.3390/stats6030058","url":null,"abstract":"COVID-19 spread dramatically across the world in the beginning of 2020. This paper presents a novel alert system that will detect abrupt changes in the COVID-19 or other pandemic incidence rate through the estimated time-varying reproduction number (Rt). We applied the system to detect abrupt changes in the COVID-19 pandemic incidence rates in thirteen world regions with eight in the US and five across the world. Subsequently, we also evaluated the system with the 2009 H1N1 pandemic in Hong Kong. Our system performs well in detecting both the abrupt increases and decreases. Users of the system can obtain accurate information on the changing trend of the pandemic to avoid being misled by low incidence numbers. The world may face other threatening pandemics in the future; therefore, it is crucial to have a reliable alert system to detect impending abrupt changes in the daily incidence rates. An added benefit of the system is its ability to detect the emergence of viral mutations, as different virus strains are likely to have different infection rates.","PeriodicalId":93142,"journal":{"name":"Stats","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135202902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-09-01, DOI: 10.3390/stats6030057
J. C. W. Rayner, G. C. Livingston
{"title":"Orthonormal F Contrasts for Factors with Ordered Levels in Two-Factor Fixed-Effects ANOVAs","authors":"J. C. W. Rayner, G. C. Livingston","doi":"10.3390/stats6030057","DOIUrl":"https://doi.org/10.3390/stats6030057","url":null,"abstract":"In multifactor fixed-effects ANOVAs, we show how to construct orthonormal F contrasts for main effects. Our primary focus is the case when the levels of the factor of interest are ordered. Likewise, in multifactor equally replicated fixed-effects ANOVAs, we show how to construct orthonormal F contrasts for interactions. The primary focus here is on interactions when both factors are ordered, although the approach also applies if just one factor is ordered. Interactions with both factors ordered may be interpreted in terms of generalised correlations.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49004821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stats, Pub Date: 2023-08-29, DOI: 10.3390/stats6030056
Fanny Rancourt, Paula Vondrlik, Diego Maupomé, Marie-Jean Meurs
{"title":"Investigating Self-Rationalizing Models for Commonsense Reasoning","authors":"Fanny Rancourt, Paula Vondrlik, Diego Maupomé, Marie-Jean Meurs","doi":"10.3390/stats6030056","DOIUrl":"https://doi.org/10.3390/stats6030056","url":null,"abstract":"The rise of explainable natural language processing spurred a bulk of work on datasets augmented with human explanations, as well as technical approaches to leverage them. Notably, generative large language models offer new possibilities, as they can output a prediction as well as an explanation in natural language. This work investigates the capabilities of fine-tuned text-to-text transfer Transformer (T5) models for commonsense reasoning and explanation generation. Our experiments suggest that while self-rationalizing models achieve interesting results, a significant gap remains: classifiers consistently outperformed self-rationalizing models, and a substantial fraction of model-generated explanations are not valid. Furthermore, training with expressive free-text explanations substantially altered the inner representation of the model, suggesting that they supplied additional information and may bridge the knowledge gap. Our code is publicly available, and the experiments were run on open-access datasets, hence allowing full reproducibility.","PeriodicalId":93142,"journal":{"name":"Stats","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47029258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}