Computational Statistics & Data Analysis: Latest Articles (Page 10)

Multi-task optimization with Bayesian neural network surrogates for parameter estimation of a simulation model

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-22 DOI: 10.1016/j.csda.2024.108097

Hyungjin Kim, Chuljin Park, Heeyoung Kim

Abstract: We propose a novel framework for efficient parameter estimation in simulation models, formulated as an optimization problem that minimizes the discrepancy between physical system observations and simulation model outputs. Our framework, called multi-task optimization with Bayesian neural network surrogates (MOBS), is designed for scenarios that require the simultaneous estimation of multiple sets of parameters, each set corresponding to a distinct set of observations, while also enabling fast parameter estimation essential for real-time process monitoring and control. MOBS integrates a heuristic search algorithm, utilizing a single-layer Bayesian neural network surrogate model trained on an initial simulation dataset. This surrogate model is shared across multiple tasks to select and evaluate candidate parameter values, facilitating efficient multi-task optimization. We provide a closed-form parameter screening rule and demonstrate that the expected number of simulation runs converges to a user-specified threshold. Our framework was applied to a numerical example and a semiconductor manufacturing case study, significantly reducing computational costs while achieving accurate parameter estimation.

Volume 204, Article 108097

Citations: 0

Optimal sequential detection by sparsity likelihood

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-17 DOI: 10.1016/j.csda.2024.108089

Jingyan Huang, Hock Peng Chan

Citations: 0

Inference for the stochastic FitzHugh-Nagumo model from real action potential data via approximate Bayesian computation

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-15 DOI: 10.1016/j.csda.2024.108095

Adeline Samson, Massimiliano Tamborrino, Irene Tubikanec

Abstract: The stochastic FitzHugh-Nagumo (FHN) model is a two-dimensional nonlinear stochastic differential equation with additive degenerate noise, whose first component, the only one observed, describes the membrane voltage evolution of a single neuron. Due to its low dimensionality, its analytical and numerical tractability and its neuronal interpretation, it has been used as a case study to test the performance of different statistical methods in estimating the underlying model parameters. Existing methods, however, often require complete observations, non-degeneracy of the noise or a complex architecture (e.g., to estimate the transition density of the process, "recovering" the unobserved second component), and they may not (satisfactorily) estimate all model parameters simultaneously. Moreover, these studies lack real data applications for the stochastic FHN model. The proposed method tackles all challenges (non-globally Lipschitz drift, non-explicit solution, lack of available transition density, degeneracy of the noise and partial observations). It is an intuitive and easy-to-implement sequential Monte Carlo approximate Bayesian computation algorithm, which relies on a recent computationally efficient and structure-preserving numerical splitting scheme for synthetic data generation and on summary statistics exploiting the structural properties of the process. All model parameters are successfully estimated from simulated data and, more remarkably, real action potential data of rats. The presented novel real-data fit may broaden the scope and credibility of this classic and widely used neuronal model.

Volume 204, Article 108095
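Illustrative note: the idea behind approximate Bayesian computation can be sketched with a plain rejection sampler. This is not the authors' sequential Monte Carlo algorithm and it does not simulate the FHN model; the Gaussian toy simulator, the Uniform(-5, 5) prior, the sample-mean summary and the tolerance `eps` are all assumptions chosen only to keep the example short.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Toy simulator (an assumption): n Gaussian draws with mean theta and sd 1."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Summary statistic used to compare data sets; here simply the sample mean."""
    return statistics.fmean(data)

def abc_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw a candidate from the assumed prior
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= eps:   # accept if the summaries are close
            accepted.append(theta)
    return accepted

rng = random.Random(0)
observed = simulate(2.0, 50, rng)       # stand-in for real measurements
posterior_sample = abc_rejection(observed, 2000, 0.5, rng)
```

The accepted values in `posterior_sample` approximate the posterior of `theta`. Sequential variants such as the one named in the abstract refine this by shrinking the tolerance over several rounds rather than sampling from the prior each time.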

Citations: 0

High-dimensional copula-based Wasserstein dependence

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-15 DOI: 10.1016/j.csda.2024.108096

Steven De Keyser, Irène Gijbels

Citations: 0

Efficient Bayesian functional principal component analysis of irregularly-observed multivariate curves

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-12 DOI: 10.1016/j.csda.2024.108094

Tui H. Nolan, Sylvia Richardson, Hélène Ruffieux

Abstract: The analysis of multivariate functional curves has the potential to yield important scientific discoveries in domains such as healthcare, medicine, economics and social sciences. However, it is common for real-world settings to present longitudinal data that are both irregularly and sparsely observed, which introduces important challenges for the current functional data methodology. A Bayesian hierarchical framework for multivariate functional principal component analysis is proposed, which accommodates the intricacies of such irregular observation settings by flexibly pooling information across subjects and correlated curves. The model represents common latent dynamics via shared functional principal component scores, thereby effectively borrowing strength across curves while circumventing the computationally challenging task of estimating covariance matrices. These scores also provide a parsimonious representation of the major modes of joint variation of the curves and constitute interpretable scalar summaries that can be employed in follow-up analyses. Estimation is conducted using variational inference, ensuring that accurate posterior approximation and robust uncertainty quantification are achieved. The algorithm also introduces a novel variational message passing fragment for multivariate functional principal component Gaussian likelihood that enables modularity and reuse across models. Detailed simulations assess the effectiveness of the approach in sharing information from sparse and irregularly sampled multivariate curves. The methodology is also exploited to estimate the molecular disease courses of individual patients with SARS-CoV-2 infection and characterise patient heterogeneity in recovery outcomes; this study reveals key coordinated dynamics across the immune, inflammatory and metabolic systems, which are associated with long-COVID symptoms up to one year post disease onset. The approach is implemented in the R package bayesFPCA.

Volume 203, Article 108094

Citations: 0

A Dirichlet process model for directional-linear data with application to bloodstain pattern analysis

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-12 DOI: 10.1016/j.csda.2024.108093

Tong Zou, Hal S. Stern

Abstract: Directional data require specialized models because of the non-Euclidean nature of their domain. When a directional variable is observed jointly with linear variables, modeling their dependence adds an additional layer of complexity. A Bayesian nonparametric approach is introduced to analyze directional-linear data. Firstly, the projected normal distribution is extended to model the joint distribution of linear variables and a directional variable with arbitrary dimension projected from a higher-dimensional augmented multivariate normal distribution. The new distribution is called the semi-projected normal distribution (SPN) and can be used as the mixture distribution in a Dirichlet process model to obtain a more flexible class of models for directional-linear data. Then, a conditional inverse-Wishart distribution is proposed as part of the prior distribution to address an identifiability issue inherited from the projected normal and preserve conjugacy with the SPN. The SPN mixture model shows superior performance in clustering on synthetic data compared to the semi-wrapped Gaussian model. The experiments show the ability of the SPN mixture model to characterize bloodstain patterns. A hierarchical Dirichlet process model with the SPN distribution is built to estimate the likelihood of bloodstain patterns under a posited causal mechanism for use in a likelihood ratio approach to the analysis of forensic bloodstain pattern evidence.

Volume 204, Article 108093
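Illustrative note: the projection idea in the abstract can be sketched by drawing a Gaussian vector, keeping one coordinate as a linear variable and turning the other two into an angle. This is a simplified sketch, not the authors' SPN specification; the unit variances, the single correlation `rho` between the linear variable and the first latent coordinate, and all function names are assumptions.

```python
import math
import random

def sample_directional_linear(mu_x, mean_dir, rho, rng):
    """Draw (x, angle) from a trivariate normal whose last two coordinates are projected.

    Assumed covariance: unit variances, corr(x, y1) = rho, y2 independent of both.
    """
    y1 = rng.gauss(mean_dir[0], 1.0)
    y2 = rng.gauss(mean_dir[1], 1.0)
    z = rng.gauss(0.0, 1.0)
    # Conditional construction of x so that it has unit variance and correlation rho with y1.
    x = mu_x + rho * (y1 - mean_dir[0]) + math.sqrt(1.0 - rho ** 2) * z
    # Projecting (y1, y2) onto the unit circle leaves only the direction, i.e. an angle.
    angle = math.atan2(y2, y1)
    return x, angle

rng = random.Random(0)
draws = [sample_directional_linear(0.0, (2.0, 0.0), 0.7, rng) for _ in range(500)]
```

Because the radius of `(y1, y2)` is discarded, rescaling the latent vector leaves the angle unchanged, which is the kind of identifiability issue the abstract says the prior is designed to address.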

Citations: 0

Lost in the shuffle: Testing power in the presence of errorful network vertex labels

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-12 DOI: 10.1016/j.csda.2024.108091

Ayushi Saxena, Vince Lyzinski

Abstract: Two-sample network hypothesis testing is an important inference task with applications across diverse fields such as medicine, neuroscience, and sociology. Many of these testing methodologies operate under the implicit assumption that the vertex correspondence across networks is a priori known. This assumption is often untrue, and the power of the subsequent test can degrade when there are misaligned/label-shuffled vertices across networks. This power loss due to shuffling is theoretically explored in the context of random dot product and stochastic block model networks for a pair of hypothesis tests based on Frobenius norm differences between estimated edge probability matrices or between adjacency matrices. The loss in testing power is further reinforced by numerous simulations and experiments, both in the stochastic block model and in the random dot product graph model, where the power loss across multiple recently proposed tests in the literature is considered. Lastly, the impact that shuffling can have in real-data testing is demonstrated in a pair of examples from neuroscience and from social network analysis.

Volume 204, Article 108091
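Illustrative note: the following sketch shows how a Frobenius-norm statistic between two adjacency matrices can be computed, and how relabeling some vertices of one graph changes it. It only illustrates the setting described in the abstract and is not the authors' testing procedure; the two-block model, the edge probabilities and the function names are assumptions.

```python
import math
import random

def sample_sbm(block_of, probs, rng):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model."""
    n = len(block_of)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < probs[block_of[i]][block_of[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj

def frobenius_distance(a, b):
    """Frobenius norm of the entrywise difference of two square matrices."""
    n = len(a)
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(n)))

def shuffle_labels(adj, k, rng):
    """Return a copy of adj in which the labels of k randomly chosen vertices are permuted."""
    n = len(adj)
    chosen = rng.sample(range(n), k)
    permuted = chosen[:]
    rng.shuffle(permuted)
    perm = list(range(n))
    for src, dst in zip(chosen, permuted):
        perm[src] = dst
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

rng = random.Random(1)
block_of = [0] * 10 + [1] * 10            # two blocks of ten vertices each
probs = [[0.8, 0.1], [0.1, 0.8]]          # assumed within/between block edge probabilities
g1 = sample_sbm(block_of, probs, rng)
g2 = sample_sbm(block_of, probs, rng)
aligned_stat = frobenius_distance(g1, g2)
shuffled_stat = frobenius_distance(g1, shuffle_labels(g2, 8, rng))
```

Comparing `aligned_stat` with `shuffled_stat` over many replications, under both a null and an alternative model, is one way to explore empirically how label errors affect a test built on this statistic.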

Citations: 0

Statistical modeling of Dengue transmission dynamics with environmental factors

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-08 DOI: 10.1016/j.csda.2024.108080

Lengyang Wang, Mingke Zhang

Citations: 0

Analysis of order-of-addition experiments

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-06 DOI: 10.1016/j.csda.2024.108077

Xueru Zhang, Dennis K.J. Lin, Min-Qian Liu, Jianbin Chen

Abstract: The order-of-addition (OofA) experiment involves arranging components in a specific order to optimize a certain objective, which is attracting a great deal of attention in many disciplines, especially in the areas of biochemistry, scheduling, and engineering. Recent studies have highlighted its significance, and notable works have aimed to address NP-hard OofA problems from a statistical perspective. However, solving OofA problems presents challenges due to their complex nature and the presence of uncertainty, such as scheduling problems with uncertain processing times. These uncertainties affect processing times, which are not known with certainty in advance. They introduce heteroscedasticity into OofA experiments, where different orders result in varying dispersions. To address these challenges, a unified framework is proposed to analyze scheduling problems without making specific assumptions about the distribution of these uncertainties. It encompasses model development and optimization, encapsulating existing homoscedastic studies (where different orders produce the same dispersion value) as a specific instance. For heteroscedastic cases, a dual response optimization within an uncertainty set is proposed, aiming to minimize the dispersion of the response while keeping the location of the response at a predefined target value. However, solving the proposed non-linear minimax optimization is rather challenging. An equivalent optimization formulation with low computational cost is proposed for solving such a challenging problem. Theoretical support is established to ensure the tractability of the proposed method. Simulation studies are conducted to demonstrate the effectiveness of the proposed approach. With its solid theoretical support, ease of implementation, and ability to find an optimal order, the proposed approach offers a practical and competitive solution to general order-of-addition problems.

Volume 203, Article 108077

Citations: 0

A goodness-of-fit test for functional time series with applications to Ornstein-Uhlenbeck processes

IF 1.5, Zone 3 (Mathematics)

Computational Statistics & Data Analysis Pub Date : 2024-11-05 DOI: 10.1016/j.csda.2024.108092

J. Álvarez-Liébana, A. López-Pérez, W. González-Manteiga, M. Febrero-Bande

Abstract: High-frequency financial data can be collected as a sequence of time-ordered curves, such as intraday prices. The Functional Data Analysis (FDA) framework offers a powerful approach to uncover information embedded in the shape of the daily paths, often unavailable from classical statistical methods. A novel goodness-of-fit test for autoregressive Hilbertian (ARH) models is introduced, imposing only the Hilbert-Schmidt condition on the autocorrelation operator. The test statistic is formulated in terms of a Cramér–von Mises norm, with calibration achieved via a wild bootstrap resampling procedure. A simulation study examines the test's finite-sample performance in terms of power and size. Furthermore, a new specification test for diffusion models, including Ornstein-Uhlenbeck processes, is proposed, illustrated with an application to intraday currency exchange rates. Specifically, a two-stage methodology is proffered: first, the relationship between functional samples and their lagged values is assessed using an ARH(1) model; second, under linearity, a functional F-test is conducted.

Volume 203, Article 108092

Citations: 0