Journal of Statistical Software: Latest Articles

deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-04-06 DOI: 10.18637/jss.v105.i02
D. Rügamer, Ruolin Shen, Christina Bukas, Lisa Barros de Andrade e Sousa, Dominik Thalmeier, N. Klein, Chris Kolb, Florian Pfisterer, Philipp Kopper, B. Bischl, C. Müller
Abstract: In this paper we describe the implementation of semi-structured deep distributional regression, a flexible framework to learn conditional distributions based on the combination of additive regression models and deep networks. Our implementation encompasses (1) a modular neural network building system based on the deep learning library TensorFlow for the fusion of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, as well as (3) pre-processing steps necessary to set up such models. The software package allows users to define models in a user-friendly manner via a formula interface that is inspired by classical statistical model frameworks such as mgcv. The package's modular design and functionality provide a unique resource for both scalable estimation of complex statistical models and the combination of approaches from deep learning and statistics. This allows for state-of-the-art predictive performance while simultaneously retaining the indispensable interpretability of classical statistical models.
Citations: 15
mosum: A Package for Moving Sums in Change-Point Analysis
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-03-19 DOI: 10.18637/JSS.V097.I08
Alexander Meier, C. Kirch, Haeran Cho
Abstract: Time series data, i.e., temporally ordered data, is routinely collected and analysed in many fields of natural science, economy, technology and medicine, where it is of importance to verify the assumption of stochastic stationarity prior to modeling the data. Nonstationarities in the data are often attributed to structural changes with segments between adjacent change-points being approximately stationary. A particularly important, and thus widely studied, problem in statistics and signal processing is to detect changes in the mean at unknown time points. In this paper, we present the R package mosum, which implements elegant and mathematically well-justified procedures for the multiple mean change problem using the moving sum statistics.
Citations: 32
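The moving sum idea named in the mosum abstract can be illustrated with a minimal Python sketch. It is not the package's implementation: it assumes a single bandwidth `G` and a known noise standard deviation `sigma` (both hypothetical parameter names), and it only reports the location with the largest statistic rather than comparing against a critical value. At each split point `k`, the sum of the `G` observations to the right is compared with the sum of the `G` observations to the left, scaled by `sigma * sqrt(2G)`.

```python
import math


def mosum_statistic(x, G, sigma):
    """Return |sum(x[k:k+G]) - sum(x[k-G:k])| / (sigma * sqrt(2G)) per split k.

    The index k counts the observations to the left of the split, so valid
    values are k = G, ..., len(x) - G; all other positions hold None.
    """
    n = len(x)
    if G < 1 or 2 * G > n:
        raise ValueError("bandwidth G must satisfy 1 <= G <= len(x) / 2")
    # Prefix sums turn every window sum into a constant-time lookup.
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    scale = sigma * math.sqrt(2 * G)
    stats = [None] * (n + 1)
    for k in range(G, n - G + 1):
        right = prefix[k + G] - prefix[k]
        left = prefix[k] - prefix[k - G]
        stats[k] = abs(right - left) / scale
    return stats


def strongest_change_point(x, G, sigma):
    """Return the split k at which the moving sum statistic is largest."""
    stats = mosum_statistic(x, G, sigma)
    candidates = [k for k, s in enumerate(stats) if s is not None]
    return max(candidates, key=lambda k: stats[k])


# Noise-free toy series: the mean jumps from 0 to 2 after 50 observations.
signal = [0.0] * 50 + [2.0] * 50
k_hat = strongest_change_point(signal, G=10, sigma=1.0)
print(k_hat)  # prints 50
```

With several change points, a procedure of this kind would look at all local maxima of the statistic that exceed a threshold; how the threshold and the variance estimate are chosen is left out of this sketch.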
svars: An R Package for Data-Driven Identification in Multivariate Time Series Analysis
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-03-19 DOI: 10.18637/JSS.V097.I05
Alexander Lange, B. Dalheimer, H. Herwartz, Simone Maxand
Abstract: Structural vector autoregressive (SVAR) models are frequently applied to trace the contemporaneous linkages among (macroeconomic) variables back to an interplay of orthogonal structural shocks. Under Gaussianity the structural parameters are unidentified without additional (often external and not data-based) information. In contrast, the often reasonable assumption of heteroskedastic and/or non-Gaussian model disturbances offers the possibility to identify unique structural shocks. We describe the R package svars which implements statistical identification techniques that can be either heteroskedasticity-based or independence-based. Moreover, it includes a rich variety of analysis tools that are well known in the SVAR literature. Next to a comprehensive review of the theoretical background, we provide a detailed description of the associated R functions. Furthermore, a macroeconomic application serves as a step-by-step guide on how to apply these functions to the identification and interpretation of structural VAR models.
Citations: 16
FamEvent: An R Package for Generating and Modeling Time-to-Event Data in Family Designs
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-03-01 Epub Date: 2021-03-19 DOI: 10.18637/jss.v097.i07
Yun-Hee Choi, Laurent Briollais, Wenqing He, Karen Kopciuk
Abstract: FamEvent is a comprehensive R package for simulating and modelling age-at-disease onset in families carrying a rare gene mutation. The package can simulate complex family data for variable time-to-event outcomes under three common family study designs (population, high-risk clinic and multi-stage) with various levels of missing genetic information among family members. Residual familial correlation can be induced through the inclusion of a frailty term or a second gene. Disease-gene carrier probabilities are evaluated assuming Mendelian transmission or empirically from the data. When genetic information on the disease gene is missing, an Expectation-Maximization algorithm is employed to calculate the carrier probabilities. Penetrance model functions with ascertainment correction adapted to the sampling design provide age-specific cumulative disease risks by sex, mutation status, and other covariates for simulated data as well as real data analysis. Robust standard errors and 95% confidence intervals are available for these estimates. Plots of pedigrees and penetrance functions based on the fitted model provide graphical displays to evaluate and summarize the models.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8427460/pdf/nihms-1735562.pdf
Citations: 0
intRinsic: An R Package for Model-Based Estimation of the Intrinsic Dimension of a Dataset
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-02-23 DOI: 10.18637/jss.v106.i09
Francesco Denti
Abstract: This article illustrates intRinsic, an R package that implements novel state-of-the-art likelihood-based estimators of the intrinsic dimension of a dataset, an essential quantity for most dimensionality reduction techniques. In order to make these novel estimators easily accessible, the package contains a small number of high-level functions that rely on a broader set of efficient, low-level routines. Generally speaking, intRinsic encompasses models that fall into two categories: homogeneous and heterogeneous intrinsic dimension estimators. The first category contains the two nearest neighbors estimator, a method derived from the distributional properties of the ratios of the distances between each data point and its first two closest neighbors. The functions dedicated to this method carry out inference under both the frequentist and Bayesian frameworks. In the second category, we find the heterogeneous intrinsic dimension algorithm, a Bayesian mixture model for which an efficient Gibbs sampler is implemented. After presenting the theoretical background, we demonstrate the performance of the models on simulated datasets. This way, we can facilitate the exposition by immediately assessing the validity of the results. Then, we employ the package to study the intrinsic dimension of the Alon dataset, obtained from a famous microarray experiment. Finally, we show how the estimation of homogeneous and heterogeneous intrinsic dimensions allows us to gain valuable insights into the topological structure of a dataset.
Citations: 4
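The two nearest neighbors estimator described in the intRinsic abstract works with the ratio of each point's second-nearest to nearest neighbor distance. A rough Python sketch of the frequentist maximum-likelihood version follows. It assumes the usual modelling result that these ratios follow a Pareto distribution with scale 1 and shape equal to the intrinsic dimension `d`, so that the estimate is `n / sum(log(ratio))`. The function name and the brute-force neighbor search are illustrative choices, not the package's interface, and the points are assumed to be distinct.

```python
import math
import random


def two_nn_dimension(points):
    """Estimate the intrinsic dimension as n / sum(log(r2 / r1)).

    r1 and r2 are the distances from a point to its nearest and
    second-nearest neighbor; all points must be distinct.
    """
    n = len(points)
    log_ratio_sum = 0.0
    for i, p in enumerate(points):
        r1 = r2 = math.inf  # smallest and second-smallest distance so far
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < r1:
                r1, r2 = d, r1
            elif d < r2:
                r2 = d
        log_ratio_sum += math.log(r2 / r1)
    return n / log_ratio_sum


# A flat two-dimensional sheet embedded in three-dimensional space.
rng = random.Random(42)
sheet = []
for _ in range(500):
    u, v = rng.random(), rng.random()
    sheet.append((u, v, 0.5 * u - 0.3 * v))

d_hat = two_nn_dimension(sheet)
print(round(d_hat, 2))  # expected to be close to 2, not 3
```

The brute-force search costs on the order of n squared distance evaluations, which is fine for a toy example; a spatial index would be the natural replacement for larger datasets.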
sensobol: An R Package to Compute Variance-Based Sensitivity Indices
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-01-22 DOI: 10.18637/jss.v102.i05
A. Puy, S. L. Piano, Andrea Saltelli, S. Levin
Abstract: The R package sensobol provides several functions to conduct variance-based uncertainty and sensitivity analysis, from the estimation of sensitivity indices to the visual representation of the results. It implements several state-of-the-art first and total-order estimators and allows the computation of up to third-order effects, as well as of the approximation error, in a swift and user-friendly way. Its flexibility also makes it appropriate for models with either a scalar or a multivariate output. We illustrate its functionality by conducting a variance-based sensitivity analysis of three classic models: the Sobol' (1998) G function, the logistic population growth model of Verhulst (1845), and the spruce budworm and forest model of Ludwig, Jones and Holling (1976).
Citations: 30
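To make the notion of a first-order variance-based sensitivity index from the sensobol abstract concrete, here is a small Python sketch of one standard Monte Carlo estimator built from two independent sample matrices `A` and `B`. It is an illustration under stated assumptions, not the package's code: inputs are taken to be independent and uniform on [0, 1], plain pseudo-random sampling is used, and the names `first_order_sobol`, `model`, `k` and `n` are hypothetical. The toy model `Y = X1 + 2 * X2` has analytic first-order indices 0.2 and 0.8.

```python
import random


def first_order_sobol(model, k, n, seed=0):
    """Estimate first-order indices S_i = V_i / Var(Y) for k uniform inputs."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    y_a = [model(row) for row in A]
    y_b = [model(row) for row in B]
    both = y_a + y_b
    mean = sum(both) / len(both)
    variance = sum((y - mean) ** 2 for y in both) / (len(both) - 1)
    indices = []
    for i in range(k):
        # Rows of A with column i replaced by the matching entry of B.
        y_ab = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # V_i is estimated by mean(y_b * (y_ab - y_a)). Centering y_b keeps
        # the expectation unchanged, because y_ab - y_a has mean zero, and
        # lowers the Monte Carlo variance.
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(y_b, y_ab, y_a)) / n
        indices.append(v_i / variance)
    return indices


def toy_model(x):
    """Additive test function with analytic indices 1/5 and 4/5."""
    return x[0] + 2.0 * x[1]


s1, s2 = first_order_sobol(toy_model, k=2, n=50000, seed=1)
print(round(s1, 2), round(s2, 2))  # should be near 0.2 and 0.8
```

Each index costs one extra batch of `n` model runs on top of the two base batches, which is why the total number of model evaluations here is `n * (k + 2)`.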
Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-01-14 DOI: 10.18637/JSS.V097.I01
R. Sparapani, Charles Spanbauer, R. McCulloch
Abstract: In this article, we introduce the BART R package, whose name is an acronym for Bayesian additive regression trees. BART is a Bayesian nonparametric, machine learning, ensemble predictive modeling method for continuous, binary, categorical and time-to-event outcomes. Furthermore, BART is a tree-based, black-box method which fits the outcome to an arbitrary random function, f, of the covariates. The BART technique is relatively computationally efficient as compared to its competitors, but large sample sizes can be demanding. Therefore, the BART package includes efficient state-of-the-art implementations for continuous, binary, categorical and time-to-event outcomes that can take advantage of modern off-the-shelf hardware and software multi-threading technology. The BART package is written in C++ for both programmer and execution efficiency. The BART package takes advantage of multi-threading via forking as provided by the parallel package and OpenMP when available and supported by the platform. The ensemble of binary trees produced by a BART fit can be stored and re-used later via the R predict function. In addition to being an R package, the installed BART routines can be called directly from C++. The BART package provides the tools for your BART toolbox.
Citations: 72
The R Package forestinventory: Design-Based Global and Small Area Estimations for Multiphase Forest Inventories
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-01-14 DOI: 10.18637/JSS.V097.I04
Andreas Hill, Alexander Massey, D. Mandallaz
Abstract: Forest inventories provide reliable evidence-based information to assess the state and development of forests over time. They typically consist of a random sample of plot locations in the forest that are assessed individually by field crews. Due to the high costs of these terrestrial campaigns, remote sensing information available in high quantity and at low cost is frequently incorporated in the estimation process in order to reduce inventory costs or improve estimation precision. With respect to this objective, the application of multiphase forest inventory methods (e.g., double- and triple-sampling regression estimators) has proved to be efficient. While these methods have been successfully applied in practice, the availability of open-source software has been rare if not non-existent. The R package forestinventory provides a comprehensive set of global and small area regression estimators for multiphase forest inventories under simple and cluster sampling. The implemented methods have been demonstrated in various scientific studies ranging from small to large scale forest inventories, and can be used for post-stratification, regression and regression within strata. This article gives an extensive review of the mathematical theory of this family of design-based estimators, puts them into a common framework of forest inventory scenarios and demonstrates their application in the R environment.
Citations: 5
microsynth: Synthetic Control Methods for Disaggregated and Micro-Level Data in R
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-01-14 DOI: 10.18637/JSS.V097.I02
Michael W Robbins, Steven Davenport
Abstract: The R package microsynth has been developed for implementation of the synthetic control methodology for comparative case studies involving micro- or meso-level data. The methodology implemented within microsynth is designed to assess the efficacy of a treatment or intervention within a well-defined geographic region that is itself a composite of several smaller regions (where data are available at the more granular level for comparison regions as well). The effect of the intervention on one or more time-varying outcomes is evaluated by determining a synthetic control region that resembles the treatment region across pre-intervention values of the outcome(s) and time-invariant covariates and that is a weighted composite of many untreated comparison regions. The microsynth procedure includes functionality that enables its user to (1) calculate weights for synthetic control, (2) tabulate results for statistical inferences, and (3) create time series plots of outcomes for treatment and synthetic control. In this article, microsynth is described in detail and its application is illustrated using data from a drug market intervention in Seattle, WA.
Citations: 21
Simulating Survival Data Using the simsurv R Package
IF 5.8, Zone 2, Computer Science
Journal of Statistical Software Pub Date : 2021-01-14 DOI: 10.18637/JSS.V097.I03
S. Brilleman, R. Wolfe, M. Moreno-Betancur, M. Crowther
Abstract: The simsurv R package allows users to simulate survival (i.e., time-to-event) data from standard parametric distributions (exponential, Weibull, and Gompertz), two-component mixture distributions, or a user-defined hazard function. Baseline covariates can be included under a proportional hazards assumption. Clustered event times, for example individuals within a family, are also easily accommodated. Time-dependent effects (i.e., nonproportional hazards) can be included by interacting covariates with linear time or a user-defined function of time. Under a user-defined hazard function, event times can be generated for a variety of complex models such as flexible (spline-based) baseline hazards, models with time-varying covariates, or joint longitudinal-survival models.
Citations: 29
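For the simplest case mentioned in the simsurv abstract, a Weibull baseline with covariates under proportional hazards, event times can be drawn by inverting the survival function. The Python sketch below is only an illustration of that inversion, not the package's code. It assumes the hazard `h(t) = lam * gamma * t**(gamma - 1) * exp(x * beta)`, so that `S(t) = exp(-lam * t**gamma * exp(x * beta))`, and all function and parameter names are hypothetical.

```python
import math
import random


def weibull_ph_event_time(u, lam, gamma, linear_predictor=0.0):
    """Solve S(t) = u for t, with S(t) = exp(-lam * t**gamma * exp(lp))."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    rate = lam * math.exp(linear_predictor)
    return (-math.log(u) / rate) ** (1.0 / gamma)


def simulate_event_times(n, lam, gamma, beta, seed=0):
    """Draw n (covariate, event time) pairs with one binary covariate."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)    # e.g. a treatment indicator
        u = 1.0 - rng.random()   # uniform on (0, 1], avoids log(0)
        rows.append((x, weibull_ph_event_time(u, lam, gamma, beta * x)))
    return rows


# Deterministic check of the inversion: a survival probability of exp(-1)
# with lam = 0.5 and gamma = 2 corresponds to t = sqrt(2).
t_check = weibull_ph_event_time(math.exp(-1.0), lam=0.5, gamma=2.0)

sample = simulate_event_times(n=1000, lam=0.1, gamma=1.5, beta=-0.5, seed=7)
```

This closed-form inversion only works when the cumulative hazard can be inverted analytically; a user-defined hazard of the kind the abstract mentions would generally require numerical integration and root finding instead.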