arXiv - STAT - Methodology: Latest Papers

Robust Elicitable Functionals
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04412
Kathleen E. Miao, Silvana M. Pesenti
Abstract: Elicitable functionals and (strict) consistent scoring functions are of interest due to their utility of determining (uniquely) optimal forecasts, and thus the ability to effectively backtest predictions. However, in practice, assuming that a distribution is correctly specified is too strong a belief to reliably hold. To remediate this, we incorporate a notion of statistical robustness into the framework of elicitable functionals, meaning that our robust functional accounts for "small" misspecifications of a baseline distribution. Specifically, we propose a robustified version of elicitable functionals by using the Kullback-Leibler divergence to quantify potential misspecifications from a baseline distribution. We show that the robust elicitable functionals admit unique solutions lying at the boundary of the uncertainty region. Since every elicitable functional possesses infinitely many scoring functions, we propose the class of b-homogeneous strictly consistent scoring functions, for which the robust functionals maintain desirable statistical properties. We show the applicability of the REF in two examples: in the reinsurance setting and in robust regression problems.
Citations: 0
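The abstract above states that the robust solutions lie on the boundary of the Kullback-Leibler uncertainty region. A minimal sketch of why that is plausible, under assumptions that are not taken from the paper: a finite-support baseline distribution `p` with positive masses, a fixed forecast whose scores are given as a list `losses`, and the standard fact that the worst-case distribution in a KL ball is an exponential tilt of the baseline. The function names are hypothetical, and the outer minimization over the forecast is not shown.

```python
import math

def tilt(p, losses, theta):
    """Exponential tilt: q_i proportional to p_i * exp(theta * loss_i)."""
    m = max(losses)  # subtract the maximum for numerical stability
    w = [pi * math.exp(theta * (li - m)) for pi, li in zip(p, losses)]
    z = sum(w)
    return [wi / z for wi in w]

def kl(q, p):
    """Kullback-Leibler divergence KL(q || p) for discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def worst_case_expectation(p, losses, eps, iters=200):
    """Maximize E_q[loss] over KL(q || p) <= eps by bisection on the tilt parameter."""
    lo, hi = 0.0, 1.0
    # KL(tilt(theta) || p) is nondecreasing in theta >= 0, so bracket eps first.
    while kl(tilt(p, losses, hi), p) < eps and hi < 1e6:
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl(tilt(p, losses, mid), p) < eps:
            lo = mid
        else:
            hi = mid
    q = tilt(p, losses, hi)
    return sum(qi * li for qi, li in zip(q, losses)), q

value, q = worst_case_expectation([0.5, 0.5], [0.0, 1.0], eps=0.05)
print(value, kl(q, [0.5, 0.5]))
```

In this toy case the worst-case distribution uses the full divergence budget (its KL divergence from the baseline equals `eps` up to numerical tolerance), which is the boundary behavior the abstract describes.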
Leveraging Machine Learning for Official Statistics: A Statistical Manifesto
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04365
Marco Puts, David Salgado, Piet Daas
Abstract: It is important for official statistics production to apply ML with statistical rigor, as it presents both opportunities and challenges. Although machine learning has enjoyed rapid technological advances in recent years, its application does not possess the methodological robustness necessary to produce high quality statistical results. In order to account for all sources of error in machine learning models, the Total Machine Learning Error (TMLE) is presented as a framework analogous to the Total Survey Error Model used in survey methodology. As a means of ensuring that ML models are both internally valid as well as externally valid, the TMLE model addresses issues such as representativeness and measurement errors. There are several case studies presented, illustrating the importance of applying more rigor to the application of machine learning in official statistics.
Citations: 0
Modelling multivariate spatio-temporal data with identifiable variational autoencoders
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04162
Mika Sipilä, Claudia Cappello, Sandra De Iaco, Klaus Nordhausen, Sara Taskinen
Abstract: Modelling multivariate spatio-temporal data with complex dependency structures is a challenging task but can be simplified by assuming that the original variables are generated from independent latent components. If these components are found, they can be modelled univariately. Blind source separation aims to recover the latent components by estimating the unmixing transformation based on the observed data only. The current methods for spatio-temporal blind source separation are restricted to linear unmixing, and nonlinear variants have not been implemented. In this paper, we extend identifiable variational autoencoder to the nonlinear nonstationary spatio-temporal blind source separation setting and demonstrate its performance using comprehensive simulation studies. Additionally, we introduce two alternative methods for the latent dimension estimation, which is a crucial task in order to obtain the correct latent representation. Finally, we illustrate the proposed methods using a meteorological application, where we estimate the latent dimension and the latent components, interpret the components, and show how nonstationarity can be accounted for and prediction accuracy can be improved by using the proposed nonlinear blind source separation method as a preprocessing method.
Citations: 0
Average Causal Effect Estimation in DAGs with Hidden Variables: Extensions of Back-Door and Front-Door Criteria
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.03962
Anna Guo, Razieh Nabi
Abstract: The identification theory for causal effects in directed acyclic graphs (DAGs) with hidden variables is well-developed, but methods for estimating and inferring functionals beyond the g-formula remain limited. Previous studies have proposed semiparametric estimators for identifiable functionals in a broad class of DAGs with hidden variables. While demonstrating double robustness in some models, existing estimators face challenges, particularly with density estimation and numerical integration for continuous variables, and their estimates may fall outside the parameter space of the target estimand. Their asymptotic properties are also underexplored, especially when using flexible statistical and machine learning models for nuisance estimation. This study addresses these challenges by introducing novel one-step corrected plug-in and targeted minimum loss-based estimators of causal effects for a class of DAGs that extend classical back-door and front-door criteria (known as the treatment primal fixability criterion in prior literature). These estimators leverage machine learning to minimize modeling assumptions while ensuring key statistical properties such as asymptotic linearity, double robustness, efficiency, and staying within the bounds of the target parameter space. We establish conditions for nuisance functional estimates in terms of L2(P)-norms to achieve root-n consistent causal effect estimates. To facilitate practical application, we have developed the flexCausal package in R.
Citations: 0
Incorporating external data for analyzing randomized clinical trials: A transfer learning approach
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04126
Yujia Gu, Hanzhong Liu, Wei Ma
Abstract: Randomized clinical trials are the gold standard for analyzing treatment effects, but high costs and ethical concerns can limit recruitment, potentially leading to invalid inferences. Incorporating external trial data with similar characteristics into the analysis using transfer learning appears promising for addressing these issues. In this paper, we present a formal framework for applying transfer learning to the analysis of clinical trials, considering three key perspectives: transfer algorithm, theoretical foundation, and inference method. For the algorithm, we adopt a parameter-based transfer learning approach to enhance the lasso-adjusted stratum-specific estimator developed for estimating treatment effects. A key component in constructing the transfer learning estimator is deriving the regression coefficient estimates within each stratum, accounting for the bias between source and target data. To provide a theoretical foundation, we derive the $l_1$ convergence rate for the estimated regression coefficients and establish the asymptotic normality of the transfer learning estimator. Our results show that when external trial data resembles current trial data, the sample size requirements can be reduced compared to using only the current trial data. Finally, we propose a consistent nonparametric variance estimator to facilitate inference. Numerical studies demonstrate the effectiveness and robustness of our proposed estimator across various scenarios.
Citations: 0
Local times of self-intersection and sample path properties of Volterra Gaussian processes
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04377
Olga Izyumtseva, Wasiur R. KhudaBukhsh
Abstract: We study a Volterra Gaussian process of the form $X(t)=\int_0^t K(t,s)\,dW(s)$, where $W$ is a Wiener process and $K$ is a continuous kernel. In dimension one, we prove a law of the iterated logarithm, discuss the existence of local times and verify a continuous dependence between the local time and the kernel that generates the process. Furthermore, we prove the existence of the Rosen renormalized self-intersection local times for a planar Gaussian Volterra process.
Citations: 0
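To make the definition of a Volterra Gaussian process above concrete, the following sketch approximates $X(t)=\int_0^t K(t,s)\,dW(s)$ on a uniform grid with a left-point Riemann sum over Wiener increments. This is an illustration only and not a method from the paper: the function name `simulate_volterra`, its parameters, and the discretization scheme are assumptions made for this example.

```python
import math
import random

def simulate_volterra(kernel, T=1.0, n=500, seed=0, increments=None):
    """Approximate X(t_i) = int_0^{t_i} K(t_i, s) dW(s) on a uniform grid.

    The integral is replaced by a left-point sum over Wiener increments.
    `increments` may be supplied explicitly (length n); otherwise they are
    drawn as independent N(0, dt) variables.
    """
    dt = T / n
    if increments is None:
        rng = random.Random(seed)
        increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
    times = [i * dt for i in range(n + 1)]
    path = [
        sum(kernel(t, j * dt) * increments[j] for j in range(i))
        for i, t in enumerate(times)
    ]
    return times, path

# K = 1 recovers (discretized) Brownian motion.
times, bm = simulate_volterra(lambda t, s: 1.0)
# K(t, s) = exp(-(t - s)) gives an Ornstein-Uhlenbeck-type process started at 0.
_, ou = simulate_volterra(lambda t, s: math.exp(-(t - s)))
print(bm[-1], ou[-1])
```

The double loop costs O(n^2) kernel evaluations, which is acceptable for a short illustrative path but would need a faster scheme for fine grids.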
Over-parameterized regression methods and their application to semi-supervised learning
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04001
Katsuyuki Hagiwara
Abstract: The minimum norm least squares is an estimation strategy under an over-parameterized case and, in machine learning, is known as a helpful tool for understanding the nature of deep learning. In this paper, to apply it in the context of non-parametric regression problems, we established several methods based on thresholding of SVD (singular value decomposition) components, which are referred to as SVD regression methods. We considered several methods: singular value based thresholding, hard-thresholding with cross validation, universal thresholding and bridge thresholding. Information on output samples is not utilized in the first method while it is utilized in the other methods. We then applied them to semi-supervised learning, in which unlabeled input samples are incorporated into kernel functions in a regressor. The experimental results for real data showed that, depending on the dataset, the SVD regression methods are superior to a naive ridge regression method. Unfortunately, there was no clear advantage of the methods utilizing information on output samples. Furthermore, depending on the dataset, incorporation of unlabeled input samples into kernels is found to have certain advantages.
Citations: 0
The $\infty$-S test via regression quantile affine LASSO
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04256
Sylvain Sardy, Xiaoyu Ma, Hugo Gaible
Abstract: The nonparametric sign test dates back to the early 18th century with a data analysis by John Arbuthnot. It is an alternative to Gosset's more recent $t$-test for consistent differences between two sets of observations. Fisher's $F$-test is a generalization of the $t$-test to linear regression and linear null hypotheses. Only the sign test is robust to non-Gaussianity. Gutenbrunner et al. [1993] derived a version of the sign test for linear null hypotheses in the spirit of the $F$-test, which requires the difficult estimation of the sparsity function. We propose instead a new sign test called $\infty$-S test via the convex analysis of a point estimator that thresholds the estimate towards the null hypothesis of the test.
Citations: 0
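For readers unfamiliar with the classical sign test that the abstract above takes as its starting point, here is a minimal sketch of the exact two-sided version for paired samples. It is the textbook test, not the $\infty$-S test proposed in the paper; the function name `sign_test_p_value` is hypothetical.

```python
from math import comb

def sign_test_p_value(x, y):
    """Exact two-sided sign test for paired samples x and y.

    Under the null hypothesis of no consistent difference, the number of
    positive differences is Binomial(n, 1/2), where n counts the non-tied pairs.
    """
    # Only the signs of the differences matter; tied pairs are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    k = sum(d > 0 for d in diffs)
    # Probability of a count at least as extreme as min(k, n - k) in one tail.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    # Double the tail for a two-sided test and cap at 1.
    return min(1.0, 2 * tail)

before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.3, 12.0, 11.8, 12.9]
after = [b + 0.5 for b in before]
print(sign_test_p_value(after, before))
```

Because the test uses only the signs of the differences, its null distribution does not depend on the data being Gaussian, which is the robustness property the abstract refers to.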
Fitting the Discrete Swept Skeletal Representation to Slabular Objects
arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04079
Mohsen Taheri, Stephen M. Pizer, Jörn Schulz
Abstract: Statistical shape analysis of slabular objects like groups of hippocampi is highly useful for medical researchers as it can be useful for diagnoses and understanding diseases. This work proposes a novel object representation based on locally parameterized discrete swept skeletal structures. Further, model fitting and analysis of such representations are discussed. The model fitting procedure is based on boundary division and surface flattening. The quality of the model fitting is evaluated based on the symmetry and tidiness of the skeletal structure as well as the volume of the implied boundary. The power of the method is demonstrated by visual inspection and statistical analysis of a synthetic and an actual data set in comparison with an available skeletal representation.
Citations: 0
A tutorial on panel data analysis using partially observed Markov processes via the R package panelPomp
arXiv - STAT - Methodology Pub Date : 2024-09-05 DOI: arxiv-2409.03876
Carles Breto, Jesse Wheeler, Aaron A. King, Edward L. Ionides
Abstract: The R package panelPomp supports analysis of panel data via a general class of partially observed Markov process models (PanelPOMP). This package tutorial describes how the mathematical concept of a PanelPOMP is represented in the software and demonstrates typical use-cases of panelPomp. Monte Carlo methods used for POMP models require adaptation for PanelPOMP models due to the higher dimensionality of panel data. The package takes advantage of recent advances for PanelPOMP, including an iterated filtering algorithm, Monte Carlo adjusted profile methodology and block optimization methodology to assist with the large parameter spaces that can arise with panel models. In addition, tools for manipulation of models and data are provided that take advantage of the panel structure.
Citations: 0