Michael Dumelle, Tom Kincaid, Anthony R Olsen, Marc Weber
{"title":"spsurvey: Spatial Sampling Design and Analysis in R.","authors":"Michael Dumelle, Tom Kincaid, Anthony R Olsen, Marc Weber","doi":"10.18637/jss.v105.i03","DOIUrl":"10.18637/jss.v105.i03","url":null,"abstract":"<p><p><b>spsurvey</b> is an R package for design-based statistical inference, with a focus on spatial data. <b>spsurvey</b> provides the generalized random-tessellation stratified (GRTS) algorithm to select spatially balanced samples via the grts() function. The grts() function flexibly accommodates several sampling design features, including stratification, varying inclusion probabilities, legacy (or historical) sites, minimum distances between sites, and two options for replacement sites. <b>spsurvey</b> also provides a suite of data analysis options, including categorical variable analysis (cat_analysis()), continuous variable analysis (cont_analysis()), relative risk analysis (relrisk_analysis()), attributable risk analysis (attrisk_analysis()), difference in risk analysis (diffrisk_analysis()), change analysis (change_analysis()), and trend analysis (trend_analysis()). In this manuscript, we first provide background for the GRTS algorithm and the analysis approaches and then show how to implement them in <b>spsurvey</b>. 
We find that the spatially balanced GRTS algorithm yields more precise parameter estimates than simple random sampling, which ignores spatial information.</p>","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":"105 3","pages":"1-29"},"PeriodicalIF":5.8,"publicationDate":"2023-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9926341/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10747251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Equal Local Levels to Improve Q-Q Plot Testing Bands with R Package qqconf.","authors":"Eric Weine, Mary Sara McPeek, Mark Abney","doi":"10.18637/jss.v106.i10","DOIUrl":"https://doi.org/10.18637/jss.v106.i10","url":null,"abstract":"<p><p>Quantile-Quantile (Q-Q) plots are often difficult to interpret because it is unclear how large the deviation from the theoretical distribution must be to indicate a lack of fit. Most Q-Q plots could benefit from the addition of meaningful global testing bands, but the use of such bands unfortunately remains rare because of the drawbacks of current approaches and packages. These drawbacks include incorrect global Type I error rate, lack of power to detect deviations in the tails of the distribution, relatively slow computation for large data sets, and limited applicability. To solve these problems, we apply the equal local levels global testing method, which we have implemented in the R Package <b>qqconf</b>, a versatile tool to create Q-Q plots and probability-probability (P-P) plots in a wide variety of settings, with simultaneous testing bands rapidly created using recently-developed algorithms. <b>qqconf</b> can easily be used to add global testing bands to Q-Q plots made by other packages. In addition to being quick to compute, these bands have a variety of desirable properties, including accurate global levels, equal sensitivity to deviations in all parts of the null distribution (including the tails), and applicability to a range of null distributions. 
We illustrate the use of <b>qqconf</b> in several applications: assessing normality of residuals from regression, assessing accuracy of <i>p</i> values, and use of Q-Q plots in genome-wide association studies.</p>","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":"106 10","pages":""},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193564/pdf/nihms-1890451.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9497381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Elastic Net Regularization Paths for All Generalized Linear Models.","authors":"J Kenneth Tay, Balasubramanian Narasimhan, Trevor Hastie","doi":"10.18637/jss.v106.i01","DOIUrl":"10.18637/jss.v106.i01","url":null,"abstract":"<p><p>The lasso and elastic net are popular regularized regression models for supervised learning. Friedman, Hastie, and Tibshirani (2010) introduced a computationally efficient algorithm for computing the elastic net regularization path for ordinary least squares regression, logistic regression and multinomial logistic regression, while Simon, Friedman, Hastie, and Tibshirani (2011) extended this work to Cox models for right-censored data. We further extend the reach of the elastic net-regularized regression to all generalized linear model families, Cox models with (start, stop] data and strata, and a simplified version of the relaxed lasso. We also discuss convenient utility functions for measuring the performance of these fitted models.</p>","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":"106 ","pages":""},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10153598/pdf/nihms-1843576.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9776933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bob Carpenter, Andrew Gelman, Matthew D Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus A Brubaker, Jiqiang Guo, Peter Li, Allen Riddell
{"title":"Stan: A Probabilistic Programming Language.","authors":"Bob Carpenter, Andrew Gelman, Matthew D Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus A Brubaker, Jiqiang Guo, Peter Li, Allen Riddell","doi":"10.18637/jss.v076.i01","DOIUrl":"https://doi.org/10.18637/jss.v076.i01","url":null,"abstract":"<p><p>Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the <b>cmdstan</b> package, through R using the <b>rstan</b> package, and through Python using the <b>pystan</b> package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. 
<b>rstan</b> and <b>pystan</b> also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.</p>","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":"76 ","pages":""},"PeriodicalIF":5.8,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9788645/pdf/nihms-1811392.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10435856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}