Stats: Latest Articles

Statistical Analysis in the Presence of Spatial Autocorrelation: Selected Sampling Strategy Effects
Stats, Pub Date: 2022-12-16, DOI: 10.3390/stats5040081
D. Griffith, R. Plant
Abstract: Fundamental to most classical data collection sampling theory development is the random drawings assumption requiring that each targeted population member has a known sample selection (i.e., inclusion) probability. Frequently, however, unrestricted random sampling of spatially autocorrelated data is impractical and/or inefficient. Instead, randomly choosing a population subset accounts for its exhibited spatial pattern by utilizing a grid, which often provides improved parameter estimates, such as the geographic landscape mean, at least via its precision. Unfortunately, spatial autocorrelation latent in these data can produce a questionable mean and/or standard error estimate because each sampled population member contains information about its nearby members, a data feature explicitly acknowledged in model-based inference, but ignored in design-based inference. This autocorrelation effect prompted the development of formulae for calculating an effective sample size (i.e., the equivalent number of sample selections from a geographically randomly distributed population that would yield the same sampling error) estimate. Some researchers recently challenged this and other aspects of spatial statistics as being incorrect/invalid/misleading. This paper seeks to address this category of misconceptions, demonstrating that the effective geographic sample size is a valid and useful concept regardless of the inferential basis invoked. Its spatial statistical methodology builds upon the preceding ingredients.
Citations: 3
Robust Testing of Paired Outcomes Incorporating Covariate Effects in Clustered Data with Informative Cluster Size
Stats, Pub Date: 2022-12-14, DOI: 10.3390/stats5040080
S. Dutta
Abstract: Paired outcomes are common in correlated clustered data where the main aim is to compare the distributions of the outcomes in a pair. In such clustered paired data, informative cluster sizes can occur when the number of pairs in a cluster (i.e., a cluster size) is correlated to the paired outcomes or the paired differences. There have been some attempts to develop robust rank-based tests for comparing paired outcomes in such complex clustered data. Most of these existing rank tests developed for paired outcomes in clustered data compare the marginal distributions in a pair and ignore any covariate effect on the outcomes. However, when potentially important covariate data is available in observational studies, ignoring these covariate effects on the outcomes can result in a flawed inference. In this article, using rank based weighted estimating equations, we propose a robust procedure for covariate effect adjusted comparison of paired outcomes in a clustered data that can also address the issue of informative cluster size. Through simulated scenarios and real-life neuroimaging data, we demonstrate the importance of considering covariate effects during paired testing and robust performances of our proposed method in covariate adjusted paired comparisons in complex clustered data settings.
Citations: 1
Extracting Proceedings Data from Court Cases with Machine Learning
Stats, Pub Date: 2022-12-13, DOI: 10.3390/stats5040079
Bruno Mathis
Abstract: France is rolling out an open data program for all court cases, but with few metadata attached. Reusers will have to use named-entity recognition (NER) within the text body of the case to extract any value from it. Any court case may include up to 26 variables, or labels, that are related to the proceeding, regardless of the case substance. These labels are from different syntactic types: some of them are rare; others are ubiquitous. This experiment compares different algorithms, namely CRF, SpaCy, Flair and DeLFT, to extract proceedings data and uses the learning model assessment capabilities of Kairntech, an NLP platform. It shows that an NER model can apply to this large and diverse set of labels and extract data of high quality. We achieved an 87.5% F1 measure with Flair trained on more than 27,000 manual annotations. Quality may yet be improved by combining NER models by data type.
Citations: 0
Regression Models for Lifetime Data: An Overview
Stats, Pub Date: 2022-12-07, DOI: 10.3390/stats5040078
C. Caroni
Abstract: Two methods dominate the regression analysis of time-to-event data: the accelerated failure time model and the proportional hazards model. Broadly speaking, these predominate in reliability modelling and biomedical applications, respectively. However, many other methods have been proposed, including proportional odds, proportional mean residual life and several other “proportional” models. This paper presents an overview of the field and the concept behind each of these ideas. Multi-parameter modelling is also discussed, in which (in contrast to, say, the proportional hazards model) more than one parameter of the lifetime distribution may depend on covariates. This includes first hitting time (or threshold) regression based on an underlying latent stochastic process. Many of the methods that have been proposed have seen little or no practical use. Lack of user-friendly software is certainly a factor in this. Diagnostic methods are also lacking for most methods.
Citations: 0
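For readers unfamiliar with the two dominant models named in the abstract above, their textbook forms can be written side by side. This is a sketch in generic notation, not necessarily the paper's: the symbols h_0 and S_0 (baseline hazard and baseline survival function), the coefficient vector β, the covariate vector x, and the sign convention are all assumptions. In the last line, exp(ε) is taken to have survival function S_0.

```latex
\begin{align*}
h(t \mid x) &= h_0(t)\,\exp(\beta^\top x)
  && \text{(proportional hazards)} \\
S(t \mid x) &= S_0\bigl(t\,\exp(-\beta^\top x)\bigr)
  && \text{(accelerated failure time)} \\
\log T &= \beta^\top x + \varepsilon
  && \text{(accelerated failure time, log-linear form)}
\end{align*}
```

In the first model the covariates rescale the hazard; in the second they rescale time itself, which is why the two generally coincide only in special cases such as the Weibull family.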
The Lookup Table Regression Model for Histogram-Valued Symbolic Data
Stats, Pub Date: 2022-12-04, DOI: 10.3390/stats5040077
M. Ichino
Abstract: This paper presents the Lookup Table Regression Model (LTRM) for histogram-valued symbolic data. We first transform the given symbolic data to a numerical data table by the quantile method. Then, under the selected response variable, we apply the Monotone Blocks Segmentation (MBS) to the obtained numerical data table. If the selected response variable and some remained explanatory variable(s) organize a monotone structure, the MBS generates a Lookup Table composed of interval values. For a given object, we search the nearest value of an explanatory variable, then the corresponding value of the response variable becomes the estimated value. If the response variable and the explanatory variable(s) are covariate but they follow to a non-monotonic structure, we need to divide the given data into several monotone substructures. For this purpose, we apply the hierarchical conceptual clustering to the given data, and we obtain Multiple Lookup Tables by applying the MBS to each of substructures. We show the usefulness of the proposed method by using an artificial data set and real data sets.
Citations: 0
Addressing Disparities in the Propensity Score Distributions for Treatment Comparisons from Observational Studies
Stats, Pub Date: 2022-12-02, DOI: 10.3390/stats5040076
Tingting Zhou, M. Elliott, R. Little
Abstract: Propensity score (PS) based methods, such as matching, stratification, regression adjustment, simple and augmented inverse probability weighting, are popular for controlling for observed confounders in observational studies of causal effects. More recently, we proposed penalized spline of propensity prediction (PENCOMP), which multiply-imputes outcomes for unassigned treatments using a regression model that includes a penalized spline of the estimated selection probability and other covariates. For PS methods to work reliably, there should be sufficient overlap in the propensity score distributions between treatment groups. Limited overlap can result in fewer subjects being matched or in extreme weights causing numerical instability and bias in causal estimation. The problem of limited overlap suggests (a) defining alternative estimands that restrict inferences to subpopulations where all treatments have the potential to be assigned, and (b) excluding or down-weighting sample cases where the propensity to receive one of the compared treatments is close to zero. We compared PENCOMP and other PS methods for estimation of alternative causal estimands when limited overlap occurs. Simulations suggest that, when there are extreme weights, PENCOMP tends to outperform the weighted estimators for ATE and performs similarly to the weighted estimators for alternative estimands. We illustrate PENCOMP in two applications: the effect of antiretroviral treatments on CD4 counts using the Multicenter AIDS cohort study (MACS) and whether right heart catheterization (RHC) is a beneficial treatment in treating critically ill patients.
Citations: 1
A Bayesian One-Sample Test for Proportion
Stats, Pub Date: 2022-12-01, DOI: 10.3390/stats5040075
L. Al-Labadi, Yifan Cheng, Forough Fazeli-Asl, Kyuson Lim, Ya-Fang Weng
Abstract: This paper deals with a new Bayesian approach to the one-sample test for proportion. More specifically, let x = (x1, …, xn) be an independent random sample of size n from a Bernoulli distribution with an unknown parameter θ. For a fixed value θ0, the goal is to test the null hypothesis H0: θ = θ0 against all possible alternatives. The proposed approach is based on using the well-known formula of the Kullback–Leibler divergence between two binomial distributions chosen in a certain way. Then, the difference of the distance from a priori to a posteriori is compared through the relative belief ratio (a measure of evidence). Some theoretical properties of the method are developed. Examples and simulation results are included.
Citations: 1
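The abstract above relies on the closed-form Kullback–Leibler divergence between two binomial distributions with the same number of trials. The sketch below shows only that general formula, KL(Bin(n, p) || Bin(n, q)) = n [p log(p/q) + (1 − p) log((1 − p)/(1 − q))]. How the paper chooses the two distributions is not stated in the abstract, so the function names and arguments here are illustrative assumptions, and the brute-force version is included purely as a cross-check.

```python
import math


def kl_binomial(n: int, p: float, q: float) -> float:
    """Closed-form KL(Bin(n, p) || Bin(n, q)) in nats (hypothetical helper)."""
    if not (0.0 <= p <= 1.0 and 0.0 < q < 1.0):
        raise ValueError("need 0 <= p <= 1 and 0 < q < 1")

    def term(a: float, b: float) -> float:
        # a * log(a / b), using the convention 0 * log(0) = 0
        return 0.0 if a == 0.0 else a * math.log(a / b)

    return n * (term(p, q) + term(1.0 - p, 1.0 - q))


def kl_binomial_bruteforce(n: int, p: float, q: float) -> float:
    """Same quantity, summed directly over the support 0..n as a check."""
    total = 0.0
    for k in range(n + 1):
        pk = math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        qk = math.comb(n, k) * q**k * (1.0 - q) ** (n - k)
        if pk > 0.0:
            total += pk * math.log(pk / qk)
    return total


if __name__ == "__main__":
    print(kl_binomial(10, 0.3, 0.5))
    print(kl_binomial_bruteforce(10, 0.3, 0.5))
```

The divergence is zero when p equals q and grows linearly in n, which is what makes it usable as a distance-like quantity between a hypothesized and an estimated proportion.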
A Bootstrap Method for a Multiple-Imputation Variance Estimator in Survey Sampling
Stats, Pub Date: 2022-11-29, DOI: 10.3390/stats5040074
Lili Yu, Yichuan Zhao
Abstract: Rubin’s variance estimator of the multiple imputation estimator for a domain mean is not asymptotically unbiased. Kim et al. derived the closed-form bias for Rubin’s variance estimator. In addition, they proposed an asymptotically unbiased variance estimator for the multiple imputation estimator when the imputed values can be written as a linear function of the observed values. However, this needs the assumption that the covariance of the imputed values in the same imputed dataset is twice that in the different imputed datasets. In this study, we proposed a bootstrap variance estimator that does not need this assumption. Both theoretical argument and simulation studies show that it was unbiased and asymptotically valid. The new method was applied to the Hox pupil popularity data for illustration.
Citations: 0
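As background for the abstract above, the standard form of Rubin's variance estimator (the one the paper argues is biased for domain means) combines the average within-imputation variance W with the between-imputation variance B as T = W + (1 + 1/M) B. The sketch below implements only that standard rule, not the bootstrap estimator the paper proposes. The function name and inputs are assumptions for illustration.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine M completed-data results with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: estimated sampling variance from each imputed dataset
    Returns (pooled point estimate, total variance T).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance W
    b = variance(estimates)      # between-imputation variance B (divisor M - 1)
    t = w + (1.0 + 1.0 / m) * b  # Rubin's total variance
    return q_bar, t


if __name__ == "__main__":
    print(rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The factor (1 + 1/M) accounts for using a finite number of imputations; the abstract's point is that even with this correction the estimator can be biased when the target is a domain mean.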
Assessing Regional Entrepreneurship: A Bootstrapping Approach in Data Envelopment Analysis
Stats, Pub Date: 2022-11-28, DOI: 10.3390/stats5040073
I. Tsolas
Abstract: The aim of the present paper is to demonstrate the viability of using data envelopment analysis (DEA) in a regional context to evaluate entrepreneurial activities. DEA was used to assess regional entrepreneurship in Greece using individual measures of entrepreneurship as inputs and employment rates as outputs. In addition to point estimates, a bootstrap algorithm was used to produce bias-corrected metrics. In the light of the results of the study, the Greek regions perform differently in terms of converting entrepreneurial activity into job creation. Moreover, there is some evidence that unemployment may be a driver of entrepreneurship and thus negatively affects DEA-based inefficiency. The derived indicators can serve as diagnostic tools and can also be used for the design of various interventions at the regional level.
Citations: 2
On the Relation between Lambert W-Function and Generalized Hypergeometric Functions
Stats, Pub Date: 2022-11-23, DOI: 10.3390/stats5040072
P. N. Rathie, L. Ozelim
Abstract: In the theory of special functions, finding correlations between different types of functions is of great interest as unifying results, especially when considering issues such as analytic continuation. In the present paper, the relation between Lambert W-function and generalized hypergeometric functions is discussed. It will be shown that it is possible to link these functions by following two different strategies, namely, by means of the direct and inverse Mellin transform of Lambert W-function and by solving the trinomial equation originally studied by Lambert and Euler. The new results can be used both to numerically evaluate Lambert W-function and to study its analytic structure.
Citations: 1
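The abstract above mentions numerical evaluation of the Lambert W-function, defined by W(x) exp(W(x)) = x. For orientation only, the sketch below evaluates the principal branch for x ≥ 0 with plain Newton iteration. This is a swapped-in, generic technique and not the hypergeometric representation derived in the paper. The function name, starting guess, and tolerances are assumptions.

```python
import math


def lambert_w0(x: float, tol: float = 1e-12, max_iter: int = 50) -> float:
    """Principal branch W_0(x) for x >= 0 via Newton's method (illustrative)."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    # log(1 + x) lies at or above the root, where f(w) = w*exp(w) - x is
    # increasing and convex, so the Newton iterates decrease monotonically.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))  # f(w) / f'(w)
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


if __name__ == "__main__":
    w = lambert_w0(5.0)
    print(w, w * math.exp(w))  # second value should be close to 5.0
```

A quick sanity check is the identity W(e) = 1, since 1 · exp(1) = e.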