arXiv - STAT - Methodology: Latest Papers

Conformal Diffusion Models for Individual Treatment Effect Estimation and Inference
arXiv - STAT - Methodology Pub Date : 2024-08-02 DOI: arxiv-2408.01582
Hengrui Cai, Huaqing Jin, Lexin Li
Abstract: Estimating treatment effects from observational data is of central interest across numerous application domains. Individual treatment effect offers the most granular measure of treatment effect on an individual level, and is the most useful to facilitate personalized care. However, its estimation and inference remain underdeveloped due to several challenges. In this article, we propose a novel conformal diffusion model-based approach that addresses those intricate challenges. We integrate the highly flexible diffusion modeling, the model-free statistical inference paradigm of conformal inference, along with propensity score and covariate local approximation that tackle distributional shifts. We unbiasedly estimate the distributions of potential outcomes for individual treatment effect, construct an informative confidence interval, and establish rigorous theoretical guarantees. We demonstrate the competitive performance of the proposed method over existing solutions through extensive numerical studies.
Citations: 0
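The abstract above builds on conformal inference. As background only, here is a minimal sketch of a generic split conformal prediction interval. It is not the paper's method: it uses no diffusion model, propensity score, or local approximation, and it assumes exchangeable data with no distributional shift. The function name `split_conformal_interval` and the toy residuals are hypothetical.

```python
import math
import numpy as np

def split_conformal_interval(residuals_cal, y_pred_new, alpha=0.25):
    """Generic split conformal interval around a point prediction.

    residuals_cal: residuals (y - prediction) on a held-out calibration set.
    y_pred_new:    point prediction for a new unit.
    alpha:         target miscoverage level.
    """
    scores = np.sort(np.abs(np.asarray(residuals_cal, dtype=float)))
    n = scores.size
    # Rank of the calibration score used as the interval half-width.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return -math.inf, math.inf
    q = scores[k - 1]
    return y_pred_new - q, y_pred_new + q

# Hypothetical calibration residuals from some fitted regression model.
residuals = [0.1, -0.4, 0.3, -0.2, 0.5, 0.05, -0.15]
lower, upper = split_conformal_interval(residuals, y_pred_new=2.0, alpha=0.25)
```

With seven calibration residuals and `alpha=0.25`, the half-width is the sixth-smallest absolute residual. Under exchangeability the interval covers the new outcome with probability at least `1 - alpha`, whatever the regression model is.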
Empirical Bayes Linked Matrix Decomposition
arXiv - STAT - Methodology Pub Date : 2024-08-01 DOI: arxiv-2408.00237
Eric F. Lock
Abstract: Data for several applications in diverse fields can be represented as multiple matrices that are linked across rows or columns. This is particularly common in molecular biomedical research, in which multiple molecular "omics" technologies may capture different feature sets (e.g., corresponding to rows in a matrix) and/or different sample populations (corresponding to columns). This has motivated a large body of work on integrative matrix factorization approaches that identify and decompose low-dimensional signal that is shared across multiple matrices or specific to a given matrix. We propose an empirical variational Bayesian approach to this problem that has several advantages over existing techniques, including the flexibility to accommodate shared signal over any number of row or column sets (i.e., bidimensional integration), an intuitive model-based objective function that yields appropriate shrinkage for the inferred signals, and a relatively efficient estimation algorithm with no tuning parameters. A general result establishes conditions for the uniqueness of the underlying decomposition for a broad family of methods that includes the proposed approach. For scenarios with missing data, we describe an associated iterative imputation approach that is novel for the single-matrix context and a powerful approach for "blockwise" imputation (in which an entire row or column is missing) in various linked matrix contexts. Extensive simulations show that the method performs very well under different scenarios with respect to recovering underlying low-rank signal, accurately decomposing shared and specific signals, and accurately imputing missing data. The approach is applied to gene expression and miRNA data from breast cancer tissue and normal breast tissue, for which it gives an informative decomposition of variation and outperforms alternative strategies for missing data imputation.
Citations: 0
Operator on Operator Regression in Quantum Probability
arXiv - STAT - Methodology Pub Date : 2024-08-01 DOI: arxiv-2408.00289
Suprio Bhar, Subhra Sankar Dhar, Soumalya Joardar
Abstract: This article introduces operator on operator regression in quantum probability. Here in the regression model, the response and the independent variables are certain operator valued observables, and they are linearly associated with an unknown scalar coefficient (denoted by $\beta$), and the error is a random operator. In the course of this study, we propose a quantum version of a class of estimators (denoted by $M$ estimator) of $\beta$, and the large sample behaviour of those quantum versions of the estimators is derived, given the fact that the true model is also linear and the samples are observed eigenvalue pairs of the operator valued observables.
Citations: 0
Unsupervised Pairwise Causal Discovery on Heterogeneous Data using Mutual Information Measures
arXiv - STAT - Methodology Pub Date : 2024-08-01 DOI: arxiv-2408.00399
Alexandre Trilla, Nenad Mijatovic
Abstract: A fundamental task in science is to determine the underlying causal relations because it is the knowledge of this functional structure that leads to the correct interpretation of an effect given the apparent associations in the observed data. In this sense, Causal Discovery is a technique that tackles this challenge by analyzing the statistical properties of the constituent variables. In this work, we target the generalizability of the discovery method by following a reductionist approach that only involves two variables, i.e., the pairwise or bi-variate setting. We question the current (possibly misleading) baseline results on the basis that they were obtained through supervised learning, which is arguably contrary to this genuinely exploratory endeavor. In consequence, we approach this problem in an unsupervised way, using robust Mutual Information measures, and observing the impact of the different variable types, which is oftentimes ignored in the design of solutions. Thus, we provide a novel set of standard unbiased results that can serve as a reference to guide future discovery tasks in completely unknown environments.
Citations: 0
A Dirichlet stochastic block model for composition-weighted networks
arXiv - STAT - Methodology Pub Date : 2024-08-01 DOI: arxiv-2408.00651
Iuliia Promskaia, Adrian O'Hagan, Michael Fop
Abstract: Network data are observed in various applications where the individual entities of the system interact with or are connected to each other, and often these interactions are defined by their associated strength or importance. Clustering is a common task in network analysis that involves finding groups of nodes displaying similarities in the way they interact with the rest of the network. However, most clustering methods use the strengths of connections between entities in their original form, ignoring the possible differences in the capacities of individual nodes to send or receive edges. This often leads to clustering solutions that are heavily influenced by the nodes' capacities. One way to overcome this is to analyse the strengths of connections in relative rather than absolute terms, expressing each edge weight as a proportion of the sending (or receiving) capacity of the respective node. This, however, induces additional modelling constraints that most existing clustering methods are not designed to handle. In this work we propose a stochastic block model for composition-weighted networks based on direct modelling of compositional weight vectors using a Dirichlet mixture, with the parameters determined by the cluster labels of the sender and the receiver nodes. Inference is implemented via an extension of the classification expectation-maximisation algorithm that uses a working independence assumption, expressing the complete data likelihood of each node of the network as a function of fixed cluster labels of the remaining nodes. A model selection criterion is derived to aid the choice of the number of clusters. The model is validated using simulation studies, and showcased on network data from the Erasmus exchange program and a bike sharing network for the city of London.
Citations: 0
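The abstract above describes expressing each edge weight as a proportion of the sending (or receiving) capacity of its node. A minimal sketch of that normalisation step alone might look as follows. It does not implement the Dirichlet mixture or the classification EM algorithm. The function name `to_compositional_weights`, the toy matrix, and the handling of zero-capacity nodes are assumptions of this sketch.

```python
import numpy as np

def to_compositional_weights(W, axis=1):
    """Rescale a non-negative weight matrix to relative weights.

    axis=1: each row sums to one (share of the sender's total outgoing weight).
    axis=0: each column sums to one (share of the receiver's total incoming weight).
    """
    W = np.asarray(W, dtype=float)
    totals = W.sum(axis=axis, keepdims=True)
    # Assumption of this sketch: nodes with zero total capacity stay all-zero
    # instead of producing a division by zero.
    safe_totals = np.where(totals > 0, totals, 1.0)
    return W / safe_totals

# Hypothetical directed network: W[i, j] is the weight sent from node i to node j.
W = np.array([[0.0, 8.0, 2.0],
              [1.0, 0.0, 1.0],
              [30.0, 10.0, 0.0]])
P = to_compositional_weights(W, axis=1)
```

In this toy example node 2 sends far more in absolute terms than node 1, yet after normalisation each row is a composition on the simplex, so the sending profiles are compared independently of capacity.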
Aggregation Models with Optimal Weights for Distributed Gaussian Processes
arXiv - STAT - Methodology Pub Date : 2024-08-01 DOI: arxiv-2408.00955
Haoyuan Chen, Rui Tuo
Abstract: Gaussian process (GP) models have received increasing attention in recent years due to their superb prediction accuracy and modeling flexibility. To address the computational burdens of GP models for large-scale datasets, distributed learning for GPs is often adopted. Current aggregation models for distributed GPs are not time-efficient when incorporating correlations between GP experts. In this work, we propose a novel approach for aggregated prediction in distributed GPs. The technique is suitable for both the exact and sparse variational GPs. The proposed method incorporates correlations among experts, leading to better prediction accuracy with manageable computational requirements. As demonstrated by empirical studies, the proposed approach results in more stable predictions in less time than state-of-the-art consistent aggregation models.
Citations: 0
Unveiling land use dynamics: Insights from a hierarchical Bayesian spatio-temporal modelling of Compositional Data
arXiv - STAT - Methodology Pub Date : 2024-07-31 DOI: arxiv-2407.21695
Mario Figueira, Carmen Guarner, David Conesa, Antonio López-Quílez, Tamás Krisztin
Abstract: Changes in land use patterns have significant environmental and socioeconomic impacts, making it crucial for policymakers to understand their causes and consequences. This study, part of the European LAMASUS (Land Management for Sustainability) project, aims to support the EU's climate neutrality target by developing a governance model through collaboration between policymakers, land users, and researchers. We present a methodological synthesis for treating land use data using a Bayesian approach within spatial and spatio-temporal modeling frameworks. The study tackles the challenges of analyzing land use changes, particularly the presence of zero values and computational issues with large datasets. It introduces joint model structures to address zeros and employs sequential inference and consensus methods for Big Data problems. Spatial downscaling models approximate smaller scales from aggregated data, circumventing high-resolution data complications. We explore Beta regression and Compositional Data Analysis (CoDa) for land use data, review relevant spatial and spatio-temporal models, and present strategies for handling zeros. The paper demonstrates the implementation of key models, downscaling techniques, and solutions to Big Data challenges with examples from simulated data and the LAMASUS project, providing a comprehensive framework for understanding and managing land use changes.
Citations: 0
An overview of methods for receiver operating characteristic analysis, with an application to SARS-CoV-2 vaccine-induced humoral responses in solid organ transplant recipients
arXiv - STAT - Methodology Pub Date : 2024-07-31 DOI: arxiv-2407.21253
Nathaniel P. Dowd, Bryan Blette, James D. Chappell, Natasha B. Halasa, Andrew J. Spieker
Abstract: Receiver operating characteristic (ROC) analysis is a tool to evaluate the capacity of a numeric measure to distinguish between groups, often employed in the evaluation of diagnostic tests. Overall classification ability is sometimes crudely summarized by a single numeric measure such as the area under the empirical ROC curve. However, it may also be of interest to estimate the full ROC curve while leveraging assumptions regarding the nature of the data (parametric) or about the ROC curve directly (semiparametric). Although there has been recent interest in methods to conduct comparisons by way of stochastic ordering, nuances surrounding ROC geometry and estimation are not widely known in the broader scientific and statistical community. The overarching goals of this manuscript are to (1) provide an overview of existing frameworks for ROC curve estimation with examples, (2) offer intuition for and considerations regarding methodological trade-offs, and (3) supply sample R code to guide implementation. We utilize simulations to demonstrate the bias-variance trade-off across various methods. As an illustrative example, we analyze data from a recent cohort study in order to compare responses to SARS-CoV-2 vaccination between solid organ transplant recipients and healthy controls.
Citations: 0
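The abstract above mentions the area under the empirical ROC curve as a single-number summary. A minimal sketch of the empirical (nonparametric) ROC curve and its area might look as follows. It is written in Python, not the R code the manuscript supplies, and it does not cover the parametric or semiparametric estimators. The function names and toy scores are hypothetical, and the sketch assumes that higher scores indicate the "positive" group.

```python
import numpy as np

def empirical_roc(scores_pos, scores_neg):
    """Empirical ROC points (FPR, TPR), classifying as positive when score >= threshold."""
    pos = np.asarray(scores_pos, dtype=float)
    neg = np.asarray(scores_neg, dtype=float)
    # Sweep thresholds from the largest observed score downwards.
    thresholds = np.unique(np.concatenate([pos, neg]))[::-1]
    tpr = np.array([0.0] + [(pos >= t).mean() for t in thresholds])
    fpr = np.array([0.0] + [(neg >= t).mean() for t in thresholds])
    return fpr, tpr

def empirical_auc(scores_pos, scores_neg):
    """Area under the empirical ROC curve via the Mann-Whitney statistic:
    P(pos > neg) + 0.5 * P(pos == neg) over all positive/negative pairs."""
    pos = np.asarray(scores_pos, dtype=float)
    neg = np.asarray(scores_neg, dtype=float)
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

# Hypothetical marker values in two groups.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
fpr, tpr = empirical_roc(pos_scores, neg_scores)
auc = empirical_auc(pos_scores, neg_scores)
# Trapezoidal area under the empirical curve; it agrees with the Mann-Whitney form.
area = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

Here 8 of the 9 positive/negative pairs are correctly ordered, so both computations give 8/9.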
Deep Fréchet Regression
arXiv - STAT - Methodology Pub Date : 2024-07-31 DOI: arxiv-2407.21407
Su I Iao, Yidong Zhou, Hans-Georg Müller
Abstract: Advancements in modern science have led to the increasing availability of non-Euclidean data in metric spaces. This paper addresses the challenge of modeling relationships between non-Euclidean responses and multivariate Euclidean predictors. We propose a flexible regression model capable of handling high-dimensional predictors without imposing parametric assumptions. Two primary challenges are addressed: the curse of dimensionality in nonparametric regression and the absence of linear structure in general metric spaces. The former is tackled using deep neural networks, while for the latter we demonstrate the feasibility of mapping the metric space where responses reside to a low-dimensional Euclidean space using manifold learning. We introduce a reverse mapping approach, employing local Fréchet regression, to map the low-dimensional manifold representations back to objects in the original metric space. We develop a theoretical framework, investigating the convergence rate of deep neural networks under dependent sub-Gaussian noise with bias. The convergence rate of the proposed regression model is then obtained by expanding the scope of local Fréchet regression to accommodate multivariate predictors in the presence of errors in predictors. Simulations and case studies show that the proposed model outperforms existing methods for non-Euclidean responses, focusing on the special cases of probability measures and networks.
Citations: 0
Industrial-Grade Smart Troubleshooting through Causal Technical Language Processing: a Proof of Concept
arXiv - STAT - Methodology Pub Date : 2024-07-30 DOI: arxiv-2407.20700
Alexandre Trilla, Ossee Yiboe, Nenad Mijatovic, Jordi Vitrià
Abstract: This paper describes the development of a causal diagnosis approach for troubleshooting an industrial environment on the basis of the technical language expressed in Return on Experience records. The proposed method leverages the vectorized linguistic knowledge contained in the distributed representation of a Large Language Model, and the causal associations entailed by the embedded failure modes and mechanisms of the industrial assets. The paper presents the elementary but essential concepts of the solution, which is conceived as a causality-aware retrieval augmented generation system, and illustrates them experimentally on a real-world Predictive Maintenance setting. Finally, it discusses avenues of improvement for the maturity of the utilized causal technology to meet the robustness challenges of increasingly complex scenarios in the industry.
Citations: 0