Biostatistics and Epidemiology: Latest Articles, Page 6

Transforming data into actionable insights

Biostatistics and Epidemiology Pub Date : 2020-01-01 DOI: 10.1080/24709360.2019.1704127

C. Clancy

Citations: 0

Making causal inferences about treatment effect sizes from observational datasets

Biostatistics and Epidemiology Pub Date : 2020-01-01 DOI: 10.1080/24709360.2019.1681211

T. Kashner, Steven S. Henley, R. Golden, Xiao‐Hua Zhou

Abstract: In the era of big data and cloud computing, analysts need statistical models to go beyond predicting outcomes to forecasting how outcomes change when decision-makers intervene to change one or more causal factors. This paper reviews methods to estimate the causal effects of treatment choices on patient health outcomes using observational datasets. Methods are limited to those that model choice of treatment (propensity scoring) and treatment outcomes (instrumental variable, difference in differences, control function). A regression framework was developed to show how unobserved confounding covariates and heterogeneous outcomes can introduce biases to effect size estimates. In response to criticisms that outcome approaches are not systematic and subject to model misspecification error, we extend the control function approach of Lu and White by applying Best Approximating Model technology (BAM-CF). Results from simulation experiments are presented to compare biases between BAM-CF and propensity scoring in the presence of an unobserved confounder. We conclude no one strategy is ‘optimal’ for all datasets, and analysts should consider multiple approaches to assess robustness. For both observational and randomized datasets, researchers should assess how moderating covariates impact estimates of treatment effect sizes so that clinicians can understand what is best for each individual patient.

Biostatistics and Epidemiology, Vol. 4, No. 1, pp. 48-83. Journal Article.

Citations: 6

Common errors of interpretation in biostatistics

Biostatistics and Epidemiology Pub Date : 2020-01-01 DOI: 10.1080/24709360.2020.1790085

Elsa Vazquez Arreola, Kyle M. Irimata, Jeffrey R. Wilson

Citations: 1

Statistical modeling methods: challenges and strategies

Biostatistics and Epidemiology Pub Date : 2020-01-01 DOI: 10.1080/24709360.2019.1618653

Steven S. Henley, R. Golden, T. Kashner

Abstract: Statistical modeling methods are widely used in clinical science, epidemiology, and health services research to analyze data that has been collected in clinical trials as well as observational studies of existing data sources, such as claims files and electronic health records. Diagnostic and prognostic inferences from statistical models are critical to researchers advancing science, clinical practitioners making patient care decisions, and administrators and policy makers impacting the health care system to improve quality and reduce costs. The veracity of such inferences relies not only on the quality and completeness of the collected data, but also statistical model validity. A key component of establishing model validity is determining when a model is not correctly specified and therefore incapable of adequately representing the Data Generating Process (DGP). In this article, model validity is first described and methods designed for assessing model fit, specification, and selection are reviewed. Second, data transformations that improve the model’s ability to represent the DGP are addressed. Third, model search and validation methods are discussed. Finally, methods for evaluating predictive and classification performance are presented. Together, these methods provide a practical framework with recommendations to guide the development and evaluation of statistical models that provide valid statistical inferences.

Biostatistics and Epidemiology, Vol. 4, No. 1, pp. 105-139. Journal Article.

Citations: 28

Developments and debates on latent variable modeling in diagnostic studies when there is no gold standard

Biostatistics and Epidemiology Pub Date : 2019-10-15 DOI: 10.1080/24709360.2019.1673623

Zheyu Wang

Citations: 0

How many clusters exist? Answer via maximum clustering similarity implemented in R

Biostatistics and Epidemiology Pub Date : 2019-01-01 DOI: 10.1080/24709360.2019.1615770

A. Albatineh, M. Wilcox, B. Zogheib, M. Niewiadomska-Bugaj

Abstract: Finding the number of clusters in a data set is considered one of the fundamental problems in cluster analysis. This paper integrates maximum clustering similarity (MCS), for finding the optimal number of clusters, into R statistical software through the package MCSim. The similarity between the two clustering methods is calculated at the same number of clusters, using Rand [Objective criteria for the evaluation of clustering methods. J Am Stat Assoc. 1971;66:846–850.] and Jaccard [The distribution of the flora of the alpine zone. New Phytologist. 1912;11:37–50.] indices, corrected for chance agreement. The number of clusters at which the index most frequently attains its maximum is a candidate for the optimal number of clusters. Unlike other criteria, MCS can be used with circular data. Seven clustering algorithms, existing in R, are implemented in MCSim. A graph of the number of clusters vs. cluster similarity using corrected similarity indices is produced. Values of the similarity indices and a clustering tree (dendrogram) are produced. Several examples including simulated, real, and circular data sets are presented to show how MCSim successfully works in practice.

Biostatistics and Epidemiology, Vol. 3, No. 1, pp. 62-79. Journal Article.

Citations: 0
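
The core idea in the abstract above (compare two clustering methods at the same number of clusters with a chance-corrected index, then look for the number of clusters where agreement peaks) can be illustrated with a small sketch. This is not the MCSim package and not the authors' code. It assumes the Hubert-Arabie adjusted Rand index as the chance correction, which may differ from the correction used in the paper, and it handles only one pair of methods rather than tallying maxima across several pairs. The function names and the toy labelings are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Hubert-Arabie adjusted Rand index between two partitions of the same items.

    Assumption: this stands in for the "Rand index corrected for chance
    agreement" mentioned in the abstract; the paper's correction may differ.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")

    # Pairs of items placed together by both partitions, by A only, by B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2          # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_both - expected) / (maximum - expected)


def most_similar_k(labelings_by_k):
    """Return (k, scores) where k maximizes agreement between the two methods.

    labelings_by_k is a hypothetical input format: it maps each candidate
    number of clusters k to a pair (labels_from_method_1, labels_from_method_2).
    """
    scores = {k: adjusted_rand_index(a, b) for k, (a, b) in labelings_by_k.items()}
    return max(scores, key=scores.get), scores


# Toy labelings for six items, invented for illustration only.
example = {
    2: ([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]),  # same partition, labels swapped
    3: ([0, 0, 1, 1, 2, 2], [0, 1, 1, 2, 2, 0]),  # the two methods disagree
}
best_k, scores = most_similar_k(example)
print(best_k, scores)
```

In this toy input the two methods agree perfectly at two clusters and disagree at three, so two clusters is the candidate. With real data, the labelings would come from actual clustering algorithms run at each candidate number of clusters.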

Cohort study design for illness-death processes with disease status under intermittent observation

Biostatistics and Epidemiology Pub Date : 2019-01-01 DOI: 10.1080/24709360.2019.1699341

Nathalie C. Moon, Leilei Zeng, R. Cook

Citations: 0

Modified sparse functional principal component analysis for fMRI data process

Biostatistics and Epidemiology Pub Date : 2019-01-01 DOI: 10.1080/24709360.2019.1591072

Zhengyang Fang, J. Y. Han, N. Simon, Xiaoping Zhou

Citations: 1

A response adaptive design for ordinal categorical responses weighing the cumulative odds ratios

Biostatistics and Epidemiology Pub Date : 2019-01-01 DOI: 10.1080/24709360.2019.1660111

A. Biswas, Rahul Bhattacharya, Soumyadeep Das

Citations: 2

Regression Trees for Longitudinal Data with Baseline Covariates

Biostatistics and Epidemiology Pub Date : 2019-01-01 Epub Date: 2018-12-31 DOI: 10.1080/24709360.2018.1557797

Madan Gopal Kundu, Jaroslaw Harezlak

Abstract: Longitudinal changes in a population of interest are often heterogeneous and may be influenced by a combination of baseline factors. In such cases, traditional linear mixed effects models (Laird and Ware, 1982) assuming a common parametric form for the mean structure may not be applicable. We show that the regression tree methodology for longitudinal data can identify and characterize longitudinally homogeneous subgroups. Most of the currently available regression tree construction methods are either limited to a repeated measures scenario or combine the heterogeneity among subgroups with the random inter-subject variability. We propose a longitudinal classification and regression tree (LongCART) algorithm under the conditional inference framework (Hothorn, Hornik and Zeileis, 2006) that overcomes these limitations using a two-step approach. The LongCART algorithm first selects the partitioning variable via a parameter instability test and then finds the optimal split for the selected partitioning variable. Thus, at each node, the decision to split further is type-I error controlled, which guards against variable selection bias, over-fitting and spurious splitting. We have obtained the asymptotic results for the proposed instability test and examined its finite sample behavior through simulation studies. The comparative performance of the LongCART algorithm was evaluated empirically via simulation studies. Finally, we applied LongCART to study the longitudinal changes in choline levels among HIV-positive patients.

Biostatistics and Epidemiology, Vol. 3, No. 1, pp. 1-22. Journal Article.

Citations: 10