Michael Dumelle, Tom Kincaid, Anthony R Olsen, Marc Weber
{"title":"spsurvey: Spatial Sampling Design and Analysis in R.","authors":"Michael Dumelle, Tom Kincaid, Anthony R Olsen, Marc Weber","doi":"10.18637/jss.v105.i03","DOIUrl":"10.18637/jss.v105.i03","url":null,"abstract":"<p><p><b>spsurvey</b> is an R package for design-based statistical inference, with a focus on spatial data. <b>spsurvey</b> provides the generalized random-tessellation stratified (GRTS) algorithm to select spatially balanced samples via the grts() function. The grts() function flexibly accommodates several sampling design features, including stratification, varying inclusion probabilities, legacy (or historical) sites, minimum distances between sites, and two options for replacement sites. <b>spsurvey</b> also provides a suite of data analysis options, including categorical variable analysis (cat_analysis()), continuous variable analysis cont_analysis()), relative risk analysis (relrisk_analysis()), attributable risk analysis (attrisk_analysis()), difference in risk analysis (diffrisk_analysis()), change analysis (change_analysis()), and trend analysis (trend_analysis()). In this manuscript, we first provide background for the GRTS algorithm and the analysis approaches and then show how to implement them in <b>spsurvey</b>. We find that the spatially balanced GRTS algorithm yields more precise parameter estimates than simple random sampling, which ignores spatial information.</p>","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2023-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9926341/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10747251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<b>hdpGLM</b>: An <i>R</i> Package to Estimate Heterogeneous Effects in Generalized Linear Models Using Hierarchical Dirichlet Process","authors":"Diogo Ferrari","doi":"10.18637/jss.v107.i10","DOIUrl":"https://doi.org/10.18637/jss.v107.i10","url":null,"abstract":"The existence of latent clusters with different responses to a treatment is a major concern in scientific research, as latent effect heterogeneity often emerges due to latent or unobserved features - e.g., genetic characteristics, personality traits, or hidden motivations - of the subjects. Conventional random- and fixed-effects methods cannot be applied to that heterogeneity if the group markers associated with that heterogeneity are latent or unobserved. Alternative methods that combine regression models and clustering procedures using Dirichlet process are available, but these methods are complex to implement, especially for non-linear regression models with discrete or binary outcomes. This article discusses the R package hdpGLM as a means of implementing a novel hierarchical Dirichlet process approach to estimate mixtures of generalized linear models outlined in Ferrari (2020). The methods implemented make it easy for researchers to investigate heterogeneity in the effect of treatment or background variables and identify clusters of subjects with differential effects. This package provides several features for out-of-the-box estimation and to generate numerical summaries and visualizations of the results. A comparison with other similar R packages is provided.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136367628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Equal Local Levels to Improve Q-Q Plot Testing Bands with R Package qqconf.","authors":"Eric Weine, Mary Sara McPeek, Mark Abney","doi":"10.18637/jss.v106.i10","DOIUrl":"https://doi.org/10.18637/jss.v106.i10","url":null,"abstract":"<p><p>Quantile-Quantile (Q-Q) plots are often difficult to interpret because it is unclear how large the deviation from the theoretical distribution must be to indicate a lack of fit. Most Q-Q plots could benefit from the addition of meaningful global testing bands, but the use of such bands unfortunately remains rare because of the drawbacks of current approaches and packages. These drawbacks include incorrect global Type I error rate, lack of power to detect deviations in the tails of the distribution, relatively slow computation for large data sets, and limited applicability. To solve these problems, we apply the equal local levels global testing method, which we have implemented in the R Package <b>qqconf</b>, a versatile tool to create Q-Q plots and probability-probability (P-P) plots in a wide variety of settings, with simultaneous testing bands rapidly created using recently-developed algorithms. <b>qqconf</b> can easily be used to add global testing bands to Q-Q plots made by other packages. In addition to being quick to compute, these bands have a variety of desirable properties, including accurate global levels, equal sensitivity to deviations in all parts of the null distribution (including the tails), and applicability to a range of null distributions. We illustrate the use of <b>qqconf</b> in several applications: assessing normality of residuals from regression, assessing accuracy of <i>p</i> values, and use of Q-Q plots in genome-wide association studies.</p>","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193564/pdf/nihms-1890451.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9497381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<b>drda</b>: An <i>R</i> Package for Dose-Response Data Analysis Using Logistic Functions","authors":"Alina Malyutina, Jing Tang, Alberto Pessia","doi":"10.18637/jss.v106.i04","DOIUrl":"https://doi.org/10.18637/jss.v106.i04","url":null,"abstract":"","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135126751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<b>ARCHModels.jl</b>: Estimating ARCH Models in <i>Julia</i>","authors":"Simon A. Broda, Marc S. Paolella","doi":"10.18637/jss.v107.i05","DOIUrl":"https://doi.org/10.18637/jss.v107.i05","url":null,"abstract":"This paper introduces ARCHModels.jl, a package for the Julia programming language that implements a number of univariate and multivariate autoregressive conditional heteroskedasticity models. This model class is the workhorse tool for modeling the conditional volatility of financial assets. The distinguishing feature of these models is that they model the latent volatility as a (deterministic) function of past returns and volatilities. This recursive structure results in loop-heavy code which, due to its just-in-time compiler, Julia is well-equipped to handle. As such, the entire package is written in Julia, without any binary dependencies. We benchmark the performance of ARCHModels.jl against popular implementations in MATLAB, R, and Python, and illustrate its use in a detailed case study.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135653274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raluca Gui, Markus Meierer, Patrik Schilter, René Algesheimer
{"title":"<b>REndo</b>: Internal Instrumental Variables to Address Endogeneity","authors":"Raluca Gui, Markus Meierer, Patrik Schilter, René Algesheimer","doi":"10.18637/jss.v107.i03","DOIUrl":"https://doi.org/10.18637/jss.v107.i03","url":null,"abstract":"Endogeneity is a common problem in any causal analysis. It arises when the independence assumption between an explanatory variable and the error in a statistical model is violated. The causes of endogeneity are manifold and include response bias in surveys, omission of important explanatory variables, or simultaneity between explanatory and response variables. Instrumental variable estimation provides a possible solution. However, valid and strong external instruments are difficult to find. Consequently, internal instrumental variable approaches have been proposed to correct for endogeneity without relying on external instruments. The R package REndo implements various internal instrumental variable approaches, i.e., latent instrumental variables estimation (Ebbes, Wedel, Boeckenholt, and Steerneman 2005), higher moments estimation (Lewbel 1997), heteroscedastic error estimation (Lewbel 2012), joint estimation using copula (Park and Gupta 2012) and multilevel generalized method of moments estimation (Kim and Frees 2007). Package usage is illustrated on simulated and real-world data.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135312282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Balduzzi, G. Rücker, A. Nikolakopoulou, T. Papakonstantinou, G. Salanti, O. Efthimiou, G. Schwarzer
{"title":"netmeta: An R Package for Network Meta-Analysis Using Frequentist Methods","authors":"S. Balduzzi, G. Rücker, A. Nikolakopoulou, T. Papakonstantinou, G. Salanti, O. Efthimiou, G. Schwarzer","doi":"10.18637/jss.v106.i02","DOIUrl":"https://doi.org/10.18637/jss.v106.i02","url":null,"abstract":"Network meta-analysis compares different interventions for the same condition, by combining direct and indirect evidence derived from all eligible studies. Network meta-analysis has been increasingly used by applied scientists and it is a major research topic for methodologists. This article describes the R package netmeta , which adopts frequentist methods to fit network meta-analysis models. We provide a roadmap to perform network meta-analysis, along with an overview of the main functions of the package. We present three worked examples considering different types of outcomes and different data formats to facilitate researchers aiming to conduct network meta-analysis with netmeta","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90558206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RecordTest: An R Package to Analyze Non-Stationarity in the Extremes Based on Record-Breaking Events","authors":"Jorge Castillo-Mateo, A. Cebrián, J. Asín","doi":"10.18637/jss.v106.i05","DOIUrl":"https://doi.org/10.18637/jss.v106.i05","url":null,"abstract":"The study of non-stationary behavior in the extremes is important to analyze data in environmental sciences, climate, finance, or sports. As an alternative to the classical extreme value theory, this analysis can be based on the study of record-breaking events. The R package RecordTest provides a useful framework for non-parametric analysis of non-stationary behavior in the extremes, based on the analysis of records. The underlying idea of all the non-parametric tools implemented in the package is to use the distribution of the record occurrence under series of independent and identically distributed continuous random variables, to analyze if the observed records are compatible with that behavior. Two families of tests are implemented. The first only requires the record times of the series, while the second includes more powerful tests that join the information from different types of records: upper and lower records in the forward and backward series. The package also offers functions that cover all the steps in this type of analysis such as data preparation, identification of the records, exploratory analysis, and complementary graphical tools. The applicability of the package is illustrated with the analysis of the effect of global warming on the extremes of the daily maximum temperature series in Zaragoza, Spain.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89507792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gilles Kratzer, F. Lewis, A. Comin, M. Pittavino, R. Furrer
{"title":"Additive Bayesian Network Modeling with the R Package abn","authors":"Gilles Kratzer, F. Lewis, A. Comin, M. Pittavino, R. Furrer","doi":"10.18637/jss.v105.i08","DOIUrl":"https://doi.org/10.18637/jss.v105.i08","url":null,"abstract":"The R package abn is designed to fit additive Bayesian network models to observational datasets and contains routines to score Bayesian networks based on Bayesian or information theoretic formulations of generalized linear models. It is equipped with exact search and greedy search algorithms to select the best network, and supports continuous, discrete and count data in the same model and input of prior knowledge at a structural level. The Bayesian implementation supports random effects to control for one-layer clustering. In this paper, we give an overview of the methodology and illustrate the package’s functionality using a veterinary dataset concerned with respiratory diseases in commercial swine production.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86781577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"jumpdiff: A Python Library for Statistical Inference of Jump-Diffusion Processes in Observational or Experimental Data Sets","authors":"L. R. Gorjão, D. Witthaut, P. Lind","doi":"10.18637/jss.v105.i04","DOIUrl":"https://doi.org/10.18637/jss.v105.i04","url":null,"abstract":"We introduce a Python library, called jumpdiff , which includes all necessary functions to assess jump-diffusion processes. This library includes functions which compute a set of non-parametric estimators of all contributions composing a jump-diffusion process, namely the drift, the diffusion, and the stochastic jump strengths. Having a set of measurements from a jump-diffusion process, jumpdiff is able to retrieve the evolution equation producing data series statistically equivalent to the series of measurements. The back-end calculations are based on second-order corrections of the conditional moments expressed from the series of Kramers-Moyal coefficients. Additionally, the library is also able to test if stochastic jump contributions are present in the dynamics underlying a set of measurements. Finally, we introduce a simple iterative method for deriving second-order corrections of any Kramers-Moyal coefficient.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73722571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}