R J.: Latest Articles

Package wsbackfit for Smooth Backfitting Estimation of Generalized Structured Models
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-042
J. Roca-Pardiñas, M. Rodríguez-Álvarez, S. Sperlich
{"title":"Package wsbackfit for Smooth Backfitting Estimation of Generalized Structured Models","authors":"J. Roca-Pardiñas, M. Rodríguez-Álvarez, S. Sperlich","doi":"10.32614/rj-2021-042","DOIUrl":"https://doi.org/10.32614/rj-2021-042","url":null,"abstract":"A package is introduced that provides the weighted smooth backfitting estimator for a large family of popular semiparametric regression models. This family is known as generalized structured models and comprises, for example, generalized varying coefficient models, generalized additive models, and mixtures of these, potentially including parametric parts. The kernel-based weighted smooth backfitting estimator is among the statistically most efficient procedures for this model class. Its asymptotic properties are well understood thanks to the large body of literature about this estimator. The introduced weights allow for the inclusion of sampling weights, trimming, and efficient estimation under heteroscedasticity. Further options facilitate easy handling of aggregated data, prediction, and the presentation of estimation results. Cross-validation methods are provided which can be used for model and bandwidth selection.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"13 1","pages":"330"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78498328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Wide-to-tall Data Reshaping Using Regular Expressions and the nc Package
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-029
T. Hocking
{"title":"Wide-to-tall Data Reshaping Using Regular Expressions and the nc Package","authors":"T. Hocking","doi":"10.32614/rj-2021-029","DOIUrl":"https://doi.org/10.32614/rj-2021-029","url":null,"abstract":"Regular expressions are powerful tools for extracting tables from non-tabular text data. Capturing regular expressions that describe the information to extract from column names can be especially useful when reshaping a data table from wide (few rows with many regularly named columns) to tall (fewer columns with more rows). We present the R package nc (short for named capture), which provides functions for wide-to-tall data reshaping using regular expressions. We describe the main new ideas of nc, and provide detailed comparisons with related R packages (stats, utils, data.table, tidyr, tidyfast, tidyfst, reshape2, cdata).","PeriodicalId":20974,"journal":{"name":"R J.","volume":"66 1","pages":"69"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73837511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
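The abstract above describes reshaping from wide to tall by matching column names against a capturing regular expression. The following is a minimal sketch of that general idea in plain Python with the standard `re` module. It is not the nc package's API: the function `wide_to_tall`, the sample rows, and the column names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical wide table: one row per flower, columns named "<part>.<dim>".
wide_rows = [
    {"id": 1, "Sepal.Length": 5.1, "Sepal.Width": 3.5,
     "Petal.Length": 1.4, "Petal.Width": 0.2},
    {"id": 2, "Sepal.Length": 4.9, "Sepal.Width": 3.0,
     "Petal.Length": 1.4, "Petal.Width": 0.2},
]

# Named capture groups state which pieces of a column name become new columns.
COLUMN_PATTERN = re.compile(r"^(?P<part>Sepal|Petal)\.(?P<dim>Length|Width)$")


def wide_to_tall(rows, pattern, id_cols=("id",), value_name="value"):
    """Return one output row per (input row, matching column) pair."""
    tall = []
    for row in rows:
        ids = {key: row[key] for key in id_cols}
        for column, value in row.items():
            match = pattern.match(column)
            if match is None:
                continue  # column name does not match, so it is not reshaped
            # groupdict() maps each named group to the text it captured.
            tall.append({**ids, **match.groupdict(), value_name: value})
    return tall


tall_rows = wide_to_tall(wide_rows, COLUMN_PATTERN)
for tall_row in tall_rows[:2]:
    print(tall_row)
```

Each named group in the pattern becomes a column of the tall result, so the regular expression serves both as the column selector and as the description of the output schema.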
StratigrapheR: Concepts for Litholog Generation in R
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-039
Sébastien Wouters, A. Silva, F. Boulvain, X. Devleeschouwer
{"title":"StratigrapheR: Concepts for Litholog Generation in R","authors":"Sébastien Wouters, A. Silva, F. Boulvain, X. Devleeschouwer","doi":"10.32614/rj-2021-039","DOIUrl":"https://doi.org/10.32614/rj-2021-039","url":null,"abstract":"The StratigrapheR package proposes new concepts for the generation of lithological logs, or lithologs, in R. The generation of lithologs in a scripting environment opens new opportunities for the processing and analysis of stratified geological data. Among the new concepts presented: new plotting and data processing methodologies, new general R functions, and computer-oriented data conventions are provided. The package structure allows for these new concepts to be further improved, which can be done independently by any R user. The current limitations of the package are highlighted, along with the limitations in R for geological data processing, to help identify the best paths for improvements. Introduction StratigrapheR is a package implemented in the open-source programming environment R. StratigrapheR endeavors to explore new concepts to process stratified geological data. These concepts are provided to answer a major difficulty posed by such data; namely a large amount of field observations of varied nature, sometimes localized and small-scale, can carry information on large-scale processes. Visualizing the relevant observations all at once is therefore difficult. The usual answer to this problem in successions of stratified rocks is to report observations in a schematic form: the lithological log, or litholog (e.g., Fig. 1). The litholog is an essential tool in sedimentology and stratigraphy and proves to be equally invaluable in other fields such as volcanology, igneous petrology, or paleontology. Ideally, any data contained in a litholog should be available in a reproducible form. 
Therefore, the challenge at hand is what we would call \"from art to useful data\"; how can we best extract and/or process the information contained in a litholog, designed to be as visually informative as possible (see again Fig. 1). [Fig. 1: example litholog; legend entries: hiatus, lamellar stromatoporoids, branching stromatoporoids, lamellar tabulate corals, branching tabulate corals, brachiopods, crinoids, receptaculitids, small fenestrae, large fenestrae]","PeriodicalId":20974,"journal":{"name":"R J.","volume":"29 1","pages":"70"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90626911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automating Reproducible, Collaborative Clinical Trial Document Generation with the listdown Package
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-051
M. Kane, Xun Jiang, Simon Urbanek
{"title":"Automating Reproducible, Collaborative Clinical Trial Document Generation with the listdown Package","authors":"M. Kane, Xun Jiang, Simon Urbanek","doi":"10.32614/rj-2021-051","DOIUrl":"https://doi.org/10.32614/rj-2021-051","url":null,"abstract":"The conveyance of clinical trial explorations and analysis results from a statistician to a clinical investigator is a critical component of the drug development and clinical research cycle. Automating the process of generating documents for data descriptions, summaries, exploration, and analysis allows the statistician to provide a more comprehensive view of the information captured by a clinical trial, and efficient generation of these documents allows the statistician to focus more on the conceptual development of a trial or trial analysis and less on the implementation of the summaries and results on which decisions are made. This paper explores the use of the listdown package for automating reproducible documents in clinical trials that facilitate the collaboration between statisticians and clinicians, as well as defining an analysis pipeline for document generation. Background and Introduction: The conveyance of clinical trial explorations and analysis results from a statistician to a clinical investigator is an often overlooked but critical component of the drug development and clinical research cycle. Graphs, tables, and other analysis artifacts are at the nexus of these collaborations. They facilitate identifying problems and bugs in the data preparation and processing stage, they help to build an intuitive understanding of mechanisms of disease and their treatment, they elucidate prognostic and predictive relationships, they provide insight that results in new hypotheses, and they convince researchers of analyses testing hypotheses. Despite their importance, the process of generating these artifacts is usually done in an ad hoc manner. 
This is partially because of the nuance and diversity of the hypotheses and scientific questions being interrogated and, to a lesser degree, the variation in clinical data formatting. The usual process has a statistician providing a standard set of artifacts, receiving feedback, and providing updates based on that feedback. Work performed for one trial is rarely leveraged on others and, as a result, a large amount of work needs to be reproduced for each trial. There are two glaring problems with this approach. First, each analysis of a trial requires a substantial amount of error-prone work. While the variation between trials means some work needs to be done for preparation, exploration, and analysis, there are many aspects of these trials that could be better automated, resulting in greater efficiency and accuracy. Second, because this work is challenging, it often occupies the majority of the statistician's effort. Less time is spent on trial design and analysis, and this portion is taken up by a clinician who often has less expertise with the statistical aspects of the trial. As a result, the extra effort spent on processing data undermines the statistician's role as a collaborator and relegates them to service provider. Need tools leveraging existing work to more efficiently provide holistic views on trials ","PeriodicalId":20974,"journal":{"name":"R J.","volume":"1 1","pages":"556"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90848279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Analyzing Dependence between Point Processes in Time Using IndTestPP
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-049
A. Cebrián, J. Asín
{"title":"Analyzing Dependence between Point Processes in Time Using IndTestPP","authors":"A. Cebrián, J. Asín","doi":"10.32614/rj-2021-049","DOIUrl":"https://doi.org/10.32614/rj-2021-049","url":null,"abstract":"The need to analyze the dependence between two or more point processes in time appears in many modeling problems related to the occurrence of events, such as the occurrence of climate events at different spatial locations or synchrony detection in spike train analysis. The package IndTestPP provides a general framework for all the steps in this type of analysis, and one of its main features is the implementation of three families of tests to study independence given the intensities of the processes, which are not only useful to assess independence but also to identify factors causing dependence. The package also includes functions for generating different types of dependent point processes, and implements computational statistical inference tools using them. An application to characterize the dependence between the occurrence of extreme heat events in three Spanish locations using the package is shown.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"48 1","pages":"499"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90901395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
miRecSurv Package: Prentice-Williams-Peterson Models with Multiple Imputation of Unknown Number of Previous Episodes
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-082
D. Moriña, G. Hernández-Herrera, A. Navarro
{"title":"miRecSurv Package: Prentice-Williams-Peterson Models with Multiple Imputation of Unknown Number of Previous Episodes","authors":"D. Moriña, G. Hernández-Herrera, A. Navarro","doi":"10.32614/rj-2021-082","DOIUrl":"https://doi.org/10.32614/rj-2021-082","url":null,"abstract":"Left censoring can occur relatively frequently when analysing recurrent events in epidemiological studies, especially observational ones. Concretely, the inclusion of individuals who were already at risk before the effective initiation of a cohort study may mean that prior episodes they have already experienced are unknown, and this will easily lead to biased and inefficient estimates. The miRecSurv package is based on the use of models with specific baseline hazard, with multiple imputation of the number of prior episodes when unknown by means of the COM-Poisson distribution, a very flexible count distribution that can handle over-, sub- and equidispersion, with a stratified model depending on whether the individual had or had not previously been at risk, and the use of a frailty term. The usage of the package is illustrated by means of a real data example based on an occupational cohort study and a simulation study.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"36 1","pages":"321"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82942226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
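The abstract above relies on the COM-Poisson (Conway-Maxwell-Poisson) distribution being able to represent over-, under- and equidispersed counts. As a rough illustration of that property only, and not of how miRecSurv imputes episode counts, the sketch below evaluates the standard COM-Poisson probabilities, P(Y = y) proportional to lambda^y / (y!)^nu, on a truncated support and compares variance to mean. The function names, the truncation point `max_count`, and the parameter values are assumptions made for this example.

```python
import math


def com_poisson_pmf(lam, nu, max_count=200):
    """Approximate COM-Poisson probabilities for counts 0..max_count.

    Unnormalised weight of count j is lam**j / (j!)**nu. The infinite
    normalising sum is truncated at max_count (an assumption of this sketch).
    """
    # Work on the log scale so that large factorials do not overflow.
    log_weights = [j * math.log(lam) - nu * math.lgamma(j + 1)
                   for j in range(max_count + 1)]
    shift = max(log_weights)
    weights = [math.exp(v - shift) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def dispersion_ratio(pmf):
    """Return variance divided by mean for a pmf over counts 0, 1, 2, ..."""
    mean = sum(j * p for j, p in enumerate(pmf))
    variance = sum((j - mean) ** 2 * p for j, p in enumerate(pmf))
    return variance / mean


# nu < 1: overdispersed, nu = 1: ordinary Poisson, nu > 1: underdispersed.
ratios = {nu: dispersion_ratio(com_poisson_pmf(lam=3.0, nu=nu))
          for nu in (0.5, 1.0, 2.0)}
for nu, ratio in ratios.items():
    print(f"nu={nu}: variance/mean = {ratio:.3f}")
```

A ratio above 1 indicates overdispersion and a ratio below 1 underdispersion; with `nu = 1` the distribution reduces to the Poisson and the ratio is 1 up to truncation and rounding error.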
clustcurv: An R Package for Determining Groups in Multiple Curves
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-032
Nora M. Villanueva, M. Sestelo, Luís Meira-Machado, J. Roca-Pardiñas
{"title":"clustcurv: An R Package for Determining Groups in Multiple Curves","authors":"Nora M. Villanueva, M. Sestelo, Luís Meira-Machado, J. Roca-Pardiñas","doi":"10.32614/rj-2021-032","DOIUrl":"https://doi.org/10.32614/rj-2021-032","url":null,"abstract":"In many situations it is of interest to ascertain whether curves can be grouped, especially when confronted with a considerable number of curves. This paper introduces an R package, known as clustcurv, for determining clusters of curves with an automatic selection of their number. The package can be used for determining groups in multiple survival curves as well as for multiple regression curves. Moreover, it can be used with large numbers of curves. An illustration of the use of clustcurv is provided, using both real data examples and artificial data.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"191 1","pages":"164"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77621929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
RPESE: Risk and Performance Estimators Standard Errors with Serially Dependent Data
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-106
A. Christidis, R. Martin
{"title":"RPESE: Risk and Performance Estimators Standard Errors with Serially Dependent Data","authors":"A. Christidis, R. Martin","doi":"10.32614/rj-2021-106","DOIUrl":"https://doi.org/10.32614/rj-2021-106","url":null,"abstract":"The Risk and Performance Estimators Standard Errors package RPESE implements a new method for computing accurate standard errors of risk and performance estimators when returns are serially dependent. The new method makes use of the representation of a risk or performance estimator as a summation of a time series of influence-function (IF) transformed returns, and computes estimator standard errors using a sophisticated method of estimating the spectral density at frequency zero of the time series of IF-transformed returns. Two additional packages used by RPESE are introduced, namely RPEIF which computes and provides graphical displays of the IF of risk and performance estimators, and RPEGLMEN which implements a regularized Gamma generalized linear model polynomial fit to the periodogram of the time series of the IF-transformed returns. A Monte Carlo study shows that the new method provides more accurate estimates of standard errors for risk and performance estimators compared to well-known alternative methods in the presence of serial correlation.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"74 1","pages":"624"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82234110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
The R Quest: from Users to Developers
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-111
Simon Urbanek
{"title":"The R Quest: from Users to Developers","authors":"Simon Urbanek","doi":"10.32614/rj-2021-111","DOIUrl":"https://doi.org/10.32614/rj-2021-111","url":null,"abstract":"R is not a programming language, and this produces the inherent dichotomy between analytics and software engineering. With the emergence of data science, the opportunity exists to bridge this gap, especially through teaching practices. Genesis: How did we get here? The article “Software Engineering and R Programming: A Call to Action” summarizes the dichotomy between analytics and software engineering in the R ecosystem, provides examples where this leads to problems, and proposes what we as R users can do to bridge the gap. Data Analytic Language: The fundamental basis of the dichotomy is inherent in the evolution of S and R: they are not programming languages, but they ended up being mistaken for such. S was designed to be a data analytic language: to turn ideas into software quickly and faithfully, often used in “non-programming” style (Chambers, 1998). Its original goal was to enable statisticians to apply code written in programming languages (at the time mostly FORTRAN) to analyze data quickly and interactively, for some suitable definition of “interactive” at the time (Becker, 1994). The success of S and then R can be traced to the ability to perform data analysis by applying existing tools to data in creative ways. A data analysis is a quest: at every step we learn more about the data, which informs our decision about next steps. Whether it is an exploratory data analysis leveraging graphics, computing statistics, or fitting models, the final goal is typically not known ahead of time; it is obtained by an iterative process of applying tools that we as analysts think may lead us further (Tukey, 1977). 
It is important to note that this is exactly the opposite of software engineering where there is a well-defined goal: a specification or desired outcome, which simply needs to be expressed in a way understandable to the computer.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"475 1","pages":"697"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79938019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
openSkies - Integration of Aviation Data into the R Ecosystem
R J. Pub Date : 2021-01-01 DOI: 10.32614/rj-2021-095
Rafael Ayala, D. Ayala, L. S. Vidal, David Ruiz
{"title":"openSkies - Integration of Aviation Data into the R Ecosystem","authors":"Rafael Ayala, D. Ayala, L. S. Vidal, David Ruiz","doi":"10.32614/rj-2021-095","DOIUrl":"https://doi.org/10.32614/rj-2021-095","url":null,"abstract":"Aviation data has become increasingly accessible to the public thanks to the adoption of technologies such as Automatic Dependent Surveillance-Broadcast (ADS-B) and Mode S, which provide aircraft information over publicly accessible radio channels. Furthermore, the OpenSky Network provides multiple public resources to access such air traffic data from a large network of ADS-B receivers. Here, we present openSkies, the first R package for processing public air traffic data. The package provides an interface to the OpenSky Network resources, standardized data structures to represent the different entities involved in air traffic data, and functionalities to analyze and visualize such data. Furthermore, the portability of the implemented data structures makes openSkies easily reusable by other packages, therefore laying the foundation of aviation data engineering in R.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"1 1","pages":"485"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89877538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3