arXiv - STAT - Methodology: Latest Papers, Page 10

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04412

Kathleen E. Miao, Silvana M. Pesenti

Title: Robust Elicitable Functionals

Abstract: Elicitable functionals and (strict) consistent scoring functions are of interest due to their utility of determining (uniquely) optimal forecasts, and thus the ability to effectively backtest predictions. However, in practice, assuming that a distribution is correctly specified is too strong a belief to reliably hold. To remediate this, we incorporate a notion of statistical robustness into the framework of elicitable functionals, meaning that our robust functional accounts for "small" misspecifications of a baseline distribution. Specifically, we propose a robustified version of elicitable functionals by using the Kullback-Leibler divergence to quantify potential misspecifications from a baseline distribution. We show that the robust elicitable functionals admit unique solutions lying at the boundary of the uncertainty region. Since every elicitable functional possesses infinitely many scoring functions, we propose the class of b-homogeneous strictly consistent scoring functions, for which the robust functionals maintain desirable statistical properties. We show the applicability of the REF in two examples: in the reinsurance setting and in robust regression problems.

Citations: 0

Leveraging Machine Learning for Official Statistics: A Statistical Manifesto

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04365

Marco Puts, David Salgado, Piet Daas

Citations: 0

Modelling multivariate spatio-temporal data with identifiable variational autoencoders

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04162

Mika Sipilä, Claudia Cappello, Sandra De Iaco, Klaus Nordhausen, Sara Taskinen

Abstract: Modelling multivariate spatio-temporal data with complex dependency structures is a challenging task but can be simplified by assuming that the original variables are generated from independent latent components. If these components are found, they can be modelled univariately. Blind source separation aims to recover the latent components by estimating the unmixing transformation based on the observed data only. The current methods for spatio-temporal blind source separation are restricted to linear unmixing, and nonlinear variants have not been implemented. In this paper, we extend the identifiable variational autoencoder to the nonlinear nonstationary spatio-temporal blind source separation setting and demonstrate its performance using comprehensive simulation studies. Additionally, we introduce two alternative methods for latent dimension estimation, which is a crucial task for obtaining the correct latent representation. Finally, we illustrate the proposed methods using a meteorological application, where we estimate the latent dimension and the latent components, interpret the components, and show how nonstationarity can be accounted for and prediction accuracy can be improved by using the proposed nonlinear blind source separation method as a preprocessing method.

Citations: 0

Average Causal Effect Estimation in DAGs with Hidden Variables: Extensions of Back-Door and Front-Door Criteria

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.03962

Anna Guo, Razieh Nabi

Abstract: The identification theory for causal effects in directed acyclic graphs (DAGs) with hidden variables is well-developed, but methods for estimating and inferring functionals beyond the g-formula remain limited. Previous studies have proposed semiparametric estimators for identifiable functionals in a broad class of DAGs with hidden variables. While demonstrating double robustness in some models, existing estimators face challenges, particularly with density estimation and numerical integration for continuous variables, and their estimates may fall outside the parameter space of the target estimand. Their asymptotic properties are also underexplored, especially when using flexible statistical and machine learning models for nuisance estimation. This study addresses these challenges by introducing novel one-step corrected plug-in and targeted minimum loss-based estimators of causal effects for a class of DAGs that extend classical back-door and front-door criteria (known as the treatment primal fixability criterion in prior literature). These estimators leverage machine learning to minimize modeling assumptions while ensuring key statistical properties such as asymptotic linearity, double robustness, efficiency, and staying within the bounds of the target parameter space. We establish conditions for nuisance functional estimates in terms of L2(P)-norms to achieve root-n consistent causal effect estimates. To facilitate practical application, we have developed the flexCausal package in R.

Citations: 0

Incorporating external data for analyzing randomized clinical trials: A transfer learning approach

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04126

Yujia Gu, Hanzhong Liu, Wei Ma

Abstract: Randomized clinical trials are the gold standard for analyzing treatment effects, but high costs and ethical concerns can limit recruitment, potentially leading to invalid inferences. Incorporating external trial data with similar characteristics into the analysis using transfer learning appears promising for addressing these issues. In this paper, we present a formal framework for applying transfer learning to the analysis of clinical trials, considering three key perspectives: transfer algorithm, theoretical foundation, and inference method. For the algorithm, we adopt a parameter-based transfer learning approach to enhance the lasso-adjusted stratum-specific estimator developed for estimating treatment effects. A key component in constructing the transfer learning estimator is deriving the regression coefficient estimates within each stratum, accounting for the bias between source and target data. To provide a theoretical foundation, we derive the $l_1$ convergence rate for the estimated regression coefficients and establish the asymptotic normality of the transfer learning estimator. Our results show that when external trial data resembles current trial data, the sample size requirements can be reduced compared to using only the current trial data. Finally, we propose a consistent nonparametric variance estimator to facilitate inference. Numerical studies demonstrate the effectiveness and robustness of our proposed estimator across various scenarios.

Citations: 0

Local times of self-intersection and sample path properties of Volterra Gaussian processes

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04377

Olga Izyumtseva, Wasiur R. KhudaBukhsh

Citations: 0

Over-parameterized regression methods and their application to semi-supervised learning

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04001

Katsuyuki Hagiwara

Abstract: Minimum norm least squares is an estimation strategy for the over-parameterized case and, in machine learning, is known as a helpful tool for understanding the nature of deep learning. In this paper, to apply it in the context of non-parametric regression problems, we established several methods based on thresholding of SVD (singular value decomposition) components, which are referred to as SVD regression methods. We considered several methods: singular value based thresholding, hard-thresholding with cross validation, universal thresholding, and bridge thresholding. Information on output samples is not utilized in the first method, while it is utilized in the other methods. We then applied them to semi-supervised learning, in which unlabeled input samples are incorporated into kernel functions in a regressor. The experimental results for real data showed that, depending on the dataset, the SVD regression methods are superior to a naive ridge regression method. Unfortunately, there was no clear advantage for the methods utilizing information on output samples. Furthermore, depending on the dataset, incorporation of unlabeled input samples into kernels is found to have certain advantages.

Citations: 0

The $\infty$-S test via regression quantile affine LASSO

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04256

Sylvain Sardy, Xiaoyu Ma, Hugo Gaible

Citations: 0

Fitting the Discrete Swept Skeletal Representation to Slabular Objects

arXiv - STAT - Methodology Pub Date : 2024-09-06 DOI: arxiv-2409.04079

Mohsen Taheri, Stephen M. Pizer, Jörn Schulz

Citations: 0

A tutorial on panel data analysis using partially observed Markov processes via the R package panelPomp

arXiv - STAT - Methodology Pub Date : 2024-09-05 DOI: arxiv-2409.03876

Carles Breto, Jesse Wheeler, Aaron A. King, Edward L. Ionides

Citations: 0