R J. Latest Articles, Page 9

PASSED: Calculate Power and Sample Size for Two Sample Tests

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-094

Jinpu Li, R. Knigge, Kaiyi Chen, E. Leary

Citations: 3

pdynmc: A Package for Estimating Linear Dynamic Panel Data Models Based on Nonlinear Moment Conditions

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-035

Markus Fritsch, Andrew Adrian Yu Pua, Joachim Schnurbus

Citations: 6

Rejoinder: Software Engineering and R Programming

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-112

M. Vidoni

Citations: 0

Reproducible Summary Tables with the gtsummary Package

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-053

D. Sjoberg, Karissa A Whiting, Michael Curry, J. Lavery, J. Larmarange

Citations: 209

DChaos: An R Package for Chaotic Time Series Analysis

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-036

Julio E. Sandubete, L. Escot

Abstract: Chaos theory has been hailed as a revolution in thought and is attracting ever-increasing attention from scientists in diverse disciplines. Chaotic systems are non-linear deterministic dynamic systems whose behavior can look erratic and apparently random. A relevant field within chaos theory is the detection of chaotic behavior from empirical time-series data. One of the main features of chaos is the well-known sensitivity to initial values. Methods for testing the hypothesis of chaos try to quantify this sensitivity by estimating the so-called Lyapunov exponents. This paper describes the main methods for estimating the Lyapunov exponent from time-series data and presents the DChaos library. R users may compute the delayed-coordinate embedding vectors from time-series data, estimate the best-fitted neural net model from those embedding vectors, and calculate analytically the partial derivatives of the chosen neural net model. They can also obtain the neural net estimator of the Lyapunov exponent from the previously computed partial derivatives by two different procedures and four ways of subsampling by blocks. In summary, the DChaos package allows R users to test robustly the hypothesis of chaos, that is, to determine whether the data-generating process behind a time series behaves chaotically. The package's functionality is illustrated by examples.
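The first step named in the abstract, building delayed-coordinate embedding vectors from a time series, can be illustrated with a minimal sketch. This is not the DChaos API: the function name `delay_embed`, its parameters `m` (embedding dimension) and `lag`, and the forward-lag convention are assumptions made for illustration only.

```python
def delay_embed(series, m, lag=1):
    """Build delay vectors (x[t], x[t+lag], ..., x[t+(m-1)*lag]).

    Hypothetical helper, not taken from DChaos. The forward-lag
    ordering is an assumption; some references use backward lags.
    """
    x = list(series)
    n_vectors = len(x) - (m - 1) * lag
    if m < 1 or lag < 1 or n_vectors <= 0:
        raise ValueError("series too short for this m and lag")
    # One row per starting index t, one column per delay j.
    return [[x[t + j * lag] for j in range(m)] for t in range(n_vectors)]


if __name__ == "__main__":
    for row in delay_embed([1, 2, 3, 4, 5], m=3, lag=1):
        print(row)
```

Each row of the result is one point of the reconstructed state space; a model fitted on these rows is what a Lyapunov exponent estimator would then differentiate.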

Citations: 3

SEEDCCA: An Integrated R-Package for Canonical Correlation Analysis and Partial Least Squares

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-026

Boyoung Kim, Yunju Im, Jae Keun Yoo

Abstract: Canonical correlation analysis (CCA) has a long history as an explanatory statistical method in high-dimensional data analysis and has been successfully applied in many scientific fields, such as chemometrics, pattern recognition, and genomic sequence analysis. seedCCA is a newly developed R package that implements not only standard and seeded CCA but also partial least squares. The package makes it possible to fit CCA to large-p, small-n data. The paper provides a complete guide, and the results of seeded CCA are compared with those of the regularized CCA in an existing R package. The authors expect that the package and the paper will help practitioners of high-dimensional data analysis in various scientific fields and make the statistical methodologies of multivariate analysis more fruitful.

Introduction: Explanatory studies are important for identifying patterns and special structure in data before developing a specific model. When the relationship between a p-dimensional random variable $X \in \mathbb{R}^p$ and an r-dimensional random variable $Y \in \mathbb{R}^r$ is of primary interest, one popular explanatory statistical method is canonical correlation analysis (CCA; Hotelling (1936)). The main goal of CCA is dimension reduction of the two sets of variables by measuring the association between them. To this end, pairs of linear combinations of the variables are constructed by maximizing the Pearson correlation. CCA has been successfully applied in many scientific fields, such as chemometrics, pattern recognition, and genomic sequence analysis.

Lee and Yoo (2014) show that CCA can be used as a dimension reduction tool for high-dimensional data and that it is also connected to the least squares estimator. CCA is therefore not only an explanatory and dimension reduction method but can also serve as an alternative to least squares estimation. If $\max(p, r)$ is greater than or equal to the sample size $n$, the usual CCA is not applicable because the sample covariance matrices cannot be inverted. To overcome this, Leurgans et al. (1993) developed a regularized CCA, an idea first suggested by Vinod (1976). In practice, the CCA package by González et al. (2008) implements a version of the regularized CCA. To make the sample covariance matrices, say $\hat{\Sigma}_x$ and $\hat{\Sigma}_y$, invertible, González et al. (2008) replace them with $\hat{\Sigma}_x^{\lambda_1} = \hat{\Sigma}_x + \lambda_1 I_p$ and $\hat{\Sigma}_y^{\lambda_2} = \hat{\Sigma}_y + \lambda_2 I_r$. The optimal values of $\lambda_1$ and $\lambda_2$ are chosen by maximizing a cross-validation score over a two-dimensional grid search. Although González et al. (2008) note that a relatively small grid of reasonable values for $\lambda_1$ and $\lambda_2$ can lessen the computational burden, the search is still time-consuming, as observed in later sections. In addition, fast regularized CCA and robust CCA via projection pursuit were recently developed by Cruz-Cano (2012) and Alfons et al. (2016), respectively. Another version of CCA to handle max(p, 
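The regularization idea described above, adding a multiple of the identity to a sample covariance matrix so that it becomes invertible, can be seen on a toy case. The sketch below is only an illustration under stated assumptions: the helper names `sample_cov`, `ridge`, and `det2` are hypothetical, the data are made up, and nothing here reproduces the CCA or seedCCA package code. With n = 2 observations of a p = 2 dimensional variable, the sample covariance has rank 1, so its determinant is zero until the ridge term is added.

```python
def sample_cov(rows):
    """Sample covariance matrix (divisor n - 1) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def ridge(cov, lam):
    """Return cov + lam * I, the regularized covariance matrix."""
    p = len(cov)
    return [[cov[i][j] + (lam if i == j else 0.0) for j in range(p)]
            for i in range(p)]


def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


if __name__ == "__main__":
    x = [[1.0, 2.0], [3.0, 6.0]]      # made-up data with n = p = 2
    s = sample_cov(x)
    print(det2(s))                     # singular sample covariance
    print(det2(ridge(s, 0.5)))         # invertible after regularization
```

In a real analysis the value of the ridge parameter would be tuned, for example by the cross-validated grid search mentioned in the text, rather than fixed by hand.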

Citations: 0

Multiple Imputation and Synthetic Data Generation with NPBayesImputeCat

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-080

Jingchen Hu, O. Akande, Quanli Wang

Abstract: In many contexts, missing data and disclosure control are ubiquitous and challenging issues. In particular, the respondent-level data that statistical agencies collect from surveys and censuses can suffer from high rates of missingness. Furthermore, agencies are obliged to protect respondents' privacy when publishing the collected data for public use. The NPBayesImputeCat R package, introduced in this paper, provides routines to (i) create multiple imputations for missing data and (ii) create synthetic data for statistical disclosure control, for multivariate categorical data with or without structural zeros. We describe the Dirichlet process mixture of products of multinomial distributions model used in the package and illustrate various uses of the package with data samples from the American Community Survey (ACS). We also compare the missing data imputation results with those of the mice R package and the synthetic data generation results with those of the synthpop R package.

Introduction and background

Multiple imputation for missing data

Missing data problems arise in many statistical analyses. To impute missing values, multiple imputation, first proposed by Rubin (1987), has been widely adopted. This approach estimates predictive models based on the observed data, fills in missing values with draws from the predictive models, and produces multiple imputed and completed datasets. Data analysts then apply standard statistical analyses (e.g., regression analysis) to each imputed dataset and use appropriate combining rules to obtain valid point estimates and variance estimates (Rubin, 1987).

As a brief review of the multiple imputation combining rules for missing data, let $q$ be the completed-data estimator of some estimand of interest $Q$, and let $u$ be the estimator of the variance of $q$. For $l = 1, \ldots, m$, let $q^{(l)}$ and $u^{(l)}$ be the values of $q$ and $u$ in the $l$th completed dataset. The multiple imputation estimate of $Q$ is $\bar{q}_m = \sum_{l=1}^{m} q^{(l)}/m$, and the estimated variance associated with $\bar{q}_m$ is $T_m = (1 + 1/m) b_m + \bar{u}_m$, where $b_m = \sum_{l=1}^{m} (q^{(l)} - \bar{q}_m)^2/(m - 1)$ and $\bar{u}_m = \sum_{l=1}^{m} u^{(l)}/m$. Inferences for $Q$ are based on $(\bar{q}_m - Q) \sim t_v(0, T_m)$, where $t_v$ is a t-distribution with $v = (m - 1)\left(1 + \bar{u}_m/[(1 + 1/m) b_m]\right)^2$ degrees of freedom.

Multiple imputation by chained equations (MICE; Buuren and Groothuis-Oudshoorn (2011)) remains the most popular method for generating multiple completed datasets. Under MICE, one specifies univariate conditional models separately for each variable, usually using generalized linear models (GLMs) or classification and regression trees (CART; Breiman et al. (1984); Burgette and Reiter (2010)), and then iteratively samples plausible predicted values from the sequence of conditional models. For implementing MICE in R, most analysts use the mice package. For an in-depth review of the MICE algorithm, see Buuren and Groothuis-Oudshoorn (2011). For more details and reviews, see Rubin (1996), Harel and Zhou (2007), R
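The combining rules reviewed above can be written out as a short worked sketch. It uses the standard form of Rubin's (1987) rules, with squared deviations in the between-imputation variance and a squared factor in the degrees of freedom. The function name `combine_mi` and the example numbers are assumptions for illustration; this is not code from NPBayesImputeCat or mice.

```python
def combine_mi(q, u):
    """Combine m completed-data estimates q[l] and their variances u[l].

    Returns (q_bar, t_m, v): the pooled point estimate, its estimated
    total variance, and the t-distribution degrees of freedom.
    Hypothetical helper written for illustration.
    """
    m = len(q)
    if m < 2 or len(u) != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = sum(q) / m                                  # pooled estimate
    u_bar = sum(u) / m                                  # within-imputation variance
    b = sum((ql - q_bar) ** 2 for ql in q) / (m - 1)    # between-imputation variance
    t_m = (1 + 1 / m) * b + u_bar                       # total variance
    if b == 0:
        v = float("inf")      # no between-imputation variability
    else:
        v = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, t_m, v


if __name__ == "__main__":
    # Made-up estimates from m = 3 completed datasets.
    print(combine_mi([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

A confidence interval for Q would then use the pooled estimate, the square root of the total variance, and a t quantile with v degrees of freedom.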

Citations: 5

RobustBF: An R Package for Robust Solution to the Behrens-Fisher Problem

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-107

Gamze Güven, S. Acitas, Hatice Samkar, B. Şenoğlu

Citations: 0

BayesSenMC: an R package for Bayesian Sensitivity Analysis of Misclassification

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-097

Jinhui Yang, Lifeng Lin, H. Chu

Citations: 0

IndexNumber: An R Package for Measuring the Evolution of Magnitudes

R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-038

A. Saavedra-Nieves, P. Saavedra-Nieves

Citations: 0