{"title":"Permutation Tests for Regression, ANOVA, and Comparison of Signals: The permuco Package","authors":"Jaromil Frossard, O. Renaud","doi":"10.18637/jss.v099.i15","DOIUrl":"https://doi.org/10.18637/jss.v099.i15","url":null,"abstract":"Recent methodological research has produced permutation methods to test parameters in the presence of nuisance variables in linear models or repeated measures ANOVA. Permutation tests are also particularly useful to overcome the multiple comparisons problem as they are used to test the effect of factors or variables on signals while controlling the family-wise error rate (FWER). This article introduces the permuco package which implements several permutation methods. They can all be used jointly with multiple comparisons procedures like the cluster-mass tests or threshold-free cluster enhancement (TFCE). The permuco package is designed, first, for univariate permutation tests with nuisance variables, like regression and ANOVA; and secondly, for comparing signals as required, for example, for the analysis of event-related potential (ERP) of experiments using electroencephalography (EEG). This article describes the permutation methods and the multiple comparisons procedures implemented. A tutorial for each of these cases is provided.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87622264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"calculus: High Dimensional Numerical and Symbolic Calculus in R","authors":"E. Guidotti","doi":"10.18637/jss.v104.i05","DOIUrl":"https://doi.org/10.18637/jss.v104.i05","url":null,"abstract":"The R package calculus implements C++ optimized functions for numerical and symbolic calculus, such as the Einstein summing convention, fast computation of the Levi-Civita symbol and generalized Kronecker delta, Taylor series expansion, multivariate Hermite polynomials, high-order derivatives, ordinary differential equations, differential operators and numerical integration in arbitrary orthogonal coordinate systems. The library applies numerical methods when working with R functions or symbolic programming when working with characters or expressions. The package handles multivariate numerical calculus in arbitrary dimensions and coordinates and implements the symbolic counterpart of the numerical methods whenever possible, without depending on external computer algebra systems. Except for Rcpp, the package has no strict dependencies in order to provide a stable self-contained toolbox that invites re-use.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87161413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bambi: A Simple Interface for Fitting Bayesian Linear Models in Python","authors":"Tomás Capretto, Camen Piho, Ravi Kumar, Jacob Westfall, T. Yarkoni, O. A. Martin","doi":"10.18637/jss.v103.i15","DOIUrl":"https://doi.org/10.18637/jss.v103.i15","url":null,"abstract":"The popularity of Bayesian statistical methods has increased dramatically in recent years across many research areas and industrial applications. This is the result of a variety of methodological advances with faster and cheaper hardware as well as the development of new software tools. Here we introduce an open source Python package named Bambi (BAyesian Model Building Interface) that is built on top of the PyMC probabilistic programming framework and the ArviZ package for exploratory analysis of Bayesian models. Bambi makes it easy to specify complex generalized linear hierarchical models using a formula notation similar to those found in R. We demonstrate Bambi's versatility and ease of use with a few examples spanning a range of common statistical models including multiple regression, logistic regression, and mixed-effects modeling with crossed group specific effects. Additionally, we discuss how automatic priors are constructed. Finally, we conclude with a discussion of our plans for the future development of Bambi.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73026084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Continuous Ordinal Regression for Analysis of Visual Analogue Scales: The R Package ordinalCont","authors":"M. Manuguerra, G. Heller, Jun Ma","doi":"10.18637/jss.v096.i08","DOIUrl":"https://doi.org/10.18637/jss.v096.i08","url":null,"abstract":"This paper introduces the R package ordinalCont, which implements an ordinal regression framework for response variables which are recorded on a visual analogue scale (VAS). This scale is used when recording subjects' perception of an intangible quantity such as pain, anxiety or quality of life, and consists of a mark made on a linear scale. We implement continuous ordinal regression models for VAS as the appropriate method of analysis for such responses, and introduce smoothing terms and random effects in the linear predictor. The model parameters are estimated using constrained optimization of the penalized likelihood and the penalty parameters are automatically selected via maximization of their marginal likelihood. The estimation algorithm is shown to perform well, in a simulation study. Two examples of application are given: the first involves the analysis of pain outcomes in a clinical trial for laser treatment for chronic neck pain; the second is an analysis of quality of life outcomes in a clinical trial for chemotherapy for the treatment of breast cancer.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82063253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"fastnet: An R Package for Fast Simulation and Analysis of Large-Scale Social Networks","authors":"Xu Dong, Luis E. Castro, N. I. Shaikh","doi":"10.2139/ssrn.3121725","DOIUrl":"https://doi.org/10.2139/ssrn.3121725","url":null,"abstract":"Traditional tools and software for social network analysis are seldom scalable and/or fast. This paper provides an overview of an R package called fastnet, a tool for scaling and speeding up the simulation and analysis of large-scale social networks. fastnet uses multi-core processing and sub-graph sampling algorithms to achieve the desired scale-up and speed-up. Simple examples, usages, and comparisons of scale-up and speed-up as compared to other R packages, i.e., igraph and statnet, are presented.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68563997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating Optimal Designs for Discrete Choice Experiments in R: The idefix Package","authors":"Frits Traets, Danielle Sanchez, M. Vandebroek","doi":"10.18637/jss.v096.i03","DOIUrl":"https://doi.org/10.18637/jss.v096.i03","url":null,"abstract":"Discrete choice experiments are widely used in a broad area of research fields to capture the preference structure of respondents. The design of such experiments will determine to a large extent the accuracy with which the preference parameters can be estimated. This paper presents a new R package, called idefix, which enables users to generate optimal designs for discrete choice experiments. Besides Bayesian D-efficient designs for the multinomial logit model, the package includes functions to generate Bayesian adaptive designs which can be used to gather data for the mixed logit model. In addition, the package provides the necessary tools to set up actual surveys and collect empirical data. After data collection, idefix can be used to transform the data into the necessary format in order to use existing estimation software in R.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84399722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performing Parallel Monte Carlo and Moment Equations Methods for Itô and Stratonovich Stochastic Differential Systems: R Package Sim.DiffProc","authors":"A. Guidoum, Kamal Boukhetala","doi":"10.18637/jss.v096.i02","DOIUrl":"https://doi.org/10.18637/jss.v096.i02","url":null,"abstract":"We introduce Sim.DiffProc, an R package for symbolic and numerical computations on scalar and multivariate systems of stochastic differential equations (SDEs). It provides users with a wide range of tools to simulate, estimate, analyze, and visualize the dynamics of these systems in both forms, Itô and Stratonovich. One of Sim.DiffProc's key features is to implement the Monte Carlo method for the iterative evaluation and approximation of a quantity of interest at a fixed time on SDEs with parallel computing, on multiple processors on a single machine or a cluster of computers, which is an important tool to improve capacity and speed up calculations. We also provide an easy-to-use interface for symbolic calculation and numerical approximation of the first and central second-order moments of SDEs (i.e., mean, variance and covariance), by solving a system of ordinary differential equations, which yields insights into the dynamics of stochastic systems. The final result object of Monte Carlo and moment equations can be derived and presented in terms of LaTeX math expressions and visualized in terms of LaTeX tables. Furthermore, we illustrate various features of the package by proposing a general bivariate nonlinear dynamic system of Haken-Zwanzig, driven by additive, linear and nonlinear multiplicative noises. In addition, we consider the particular case of a scalar SDE driven by three independent Wiener processes. The Monte Carlo simulation thereof is obtained through a transformation to a system of three equations. We also study some important applications of SDEs in different fields.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75809179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gene-Based Methods to Detect Gene-Gene Interaction in R: The GeneGeneInteR Package","authors":"M. Emily, Nicolas Sounac, F. Kroell, Magalie Houée-Bigot","doi":"10.18637/jss.v095.i12","DOIUrl":"https://doi.org/10.18637/jss.v095.i12","url":null,"abstract":"GeneGeneInteR is an R package dedicated to the detection of an association between a case-control phenotype and the interaction between two sets of biallelic markers (single nucleotide polymorphisms or SNPs) in case-control genome-wide association studies. The development of statistical procedures for searching gene-gene interaction at the SNP-set level has indeed recently grown in popularity as these methods confer advantage in both statistical power and biological interpretation. However, all these methods have been implemented in homemade software that is, for the most part, available only on request from the authors and at best has a web interface. Since the implementation of these methods is not straightforward, there is a need for a user-friendly tool to perform gene-based gene-gene interaction analysis. The purpose of GeneGeneInteR is to propose a collection of tools for all the steps involved in gene-based gene-gene interaction testing in case-control association studies. Illustrated by an example of a dataset related to rheumatoid arthritis, this paper details the implementation of the functions available in GeneGeneInteR to perform an analysis of a collection of SNP sets. Such an analysis aims at addressing the complete statistical pipeline going from data importation to the visualization of the results through data manipulation and statistical analysis.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77796451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"survHE: Survival Analysis for Health Economic Evaluation and Cost-Effectiveness Modeling","authors":"G. Baio","doi":"10.18637/jss.v095.i14","DOIUrl":"https://doi.org/10.18637/jss.v095.i14","url":null,"abstract":"Survival analysis features heavily as an important part of health economic evaluation, an increasingly important component of medical research. In this setting, it is important to estimate the mean time to the survival endpoint using limited information (typically from randomized trials) and thus it is useful to consider parametric survival models. In this paper, we review the features of the R package survHE, specifically designed to wrap several tools to perform survival analysis for economic evaluation. In particular, survHE embeds both a standard, frequentist analysis (through the R package flexsurv) and a Bayesian approach, based on Hamiltonian Monte Carlo (via the R package rstan) or integrated nested Laplace approximation (with the R package INLA). Using this composite approach, we obtain maximum flexibility and are able to pre-compile a wide range of parametric models, with a view of simplifying the modelers' work and allowing them to move away from non-optimal work flows, including spreadsheets (e.g., Microsoft Excel).","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76911960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pseudo-Ranks: How to Calculate Them Efficiently in R","authors":"Martin Happ, G. Zimmermann, E. Brunner, A. Bathke","doi":"10.18637/jss.v095.c01","DOIUrl":"https://doi.org/10.18637/jss.v095.c01","url":null,"abstract":"Many popular nonparametric inferential methods are based on ranks. Among the most commonly used and most famous tests are for example the Wilcoxon-Mann-Whitney test for two independent samples, and the Kruskal-Wallis test for multiple independent groups. However, recently, it has become clear that the use of ranks may lead to paradoxical results in case of more than two groups. Luckily, these problems can be avoided simply by using pseudo-ranks instead of ranks. These pseudo-ranks, however, suffer from being (a) at first less intuitive and not as straightforward in their interpretation, (b) computationally much more expensive to calculate. The computational cost has been prohibitive, for example, for large-scale simulative evaluations or application of resampling-based pseudo-rank procedures. In this paper, we provide different algorithms to calculate pseudo-ranks efficiently in order to solve problem (b) and thus render it possible to overcome the current limitations of procedures based on pseudo-ranks.","PeriodicalId":17237,"journal":{"name":"Journal of Statistical Software","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2020-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84770972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}