{"title":"Poisson approximate likelihood compared to the particle filter","authors":"Yize Hao, Aaron A. Abkemeier, Edward L. Ionides","doi":"arxiv-2409.12173","DOIUrl":"https://doi.org/arxiv-2409.12173","url":null,"abstract":"Filtering algorithms are fundamental for inference on partially observed\u0000stochastic dynamic systems, since they provide access to the likelihood\u0000function and hence enable likelihood-based or Bayesian inference. A novel\u0000Poisson approximate likelihood (PAL) filter was introduced by Whitehouse et al.\u0000(2023). PAL employs a Poisson approximation to conditional densities, offering\u0000a fast approximation to the likelihood function for a certain subset of\u0000partially observed Markov process models. A central piece of evidence for PAL\u0000is the comparison in Table 1 of Whitehouse et al. (2023), which claims a large\u0000improvement for PAL over a standard particle filter algorithm. This evidence,\u0000based on a model and data from a previous scientific study by Stocks et al.\u0000(2020), might suggest that researchers confronted with similar models should\u0000use PAL rather than particle filter methods. Taken at face value, this evidence\u0000also reduces the credibility of Stocks et al. (2020) by indicating a\u0000shortcoming with the numerical methods that they used. However, we show that\u0000the comparison of log-likelihood values made by Whitehouse et al. (2023) is\u0000flawed because their PAL calculations were carried out using a dataset scaled\u0000differently from the previous study. If PAL and the particle filter are applied\u0000to the same data, the advantage claimed for PAL disappears. On simulations\u0000where the model is correctly specified, the particle filter outperforms PAL.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimising the Trade-Off Between Type I and Type II Errors: A Review and Extensions","authors":"Andrew P Grieve","doi":"arxiv-2409.12081","DOIUrl":"https://doi.org/arxiv-2409.12081","url":null,"abstract":"In clinical studies upon which decisions are based there are two types of\u0000errors that can be made: a type I error arises when the decision is taken to\u0000declare a positive outcome when the truth is in fact negative, and a type II\u0000error arises when the decision is taken to declare a negative outcome when the\u0000truth is in fact positive. Commonly the primary analysis of such a study\u0000entails a two-sided hypothesis test with a type I error rate of 5% and the\u0000study is designed to have a sufficiently low type II error rate, for example\u000010% or 20%. These values are arbitrary and often do not reflect the clinical,\u0000or regulatory, context of the study and ignore both the relative costs of\u0000making either type of error and the sponsor's prior belief that the drug is\u0000superior to either placebo, or a standard of care if relevant. This simplistic\u0000approach has recently been challenged by numerous authors both from a\u0000frequentist and Bayesian perspective since when resources are constrained there\u0000will be a need to consider a trade-off between type I and type II errors. In\u0000this paper we review proposals to utilise the trade-off by formally\u0000acknowledging the costs to optimise the choice of error rates for simple, point\u0000null and alternative hypotheses and extend the results to composite, or\u0000interval hypotheses, showing links to the Probability of Success of a clinical\u0000study.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"123 14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bias Reduction in Matched Observational Studies with Continuous Treatments: Calipered Non-Bipartite Matching and Bias-Corrected Estimation and Inference","authors":"Anthony Frazier, Siyu Heng, Wen Zhou","doi":"arxiv-2409.11701","DOIUrl":"https://doi.org/arxiv-2409.11701","url":null,"abstract":"Matching is a commonly used causal inference framework in observational\u0000studies. By pairing individuals with different treatment values but with the\u0000same values of covariates (i.e., exact matching), the sample average treatment\u0000effect (SATE) can be consistently estimated and inferred using the classic\u0000Neyman-type (difference-in-means) estimator and confidence interval. However,\u0000inexact matching typically exists in practice and may cause substantial bias\u0000for the downstream treatment effect estimation and inference. Many methods have\u0000been proposed to reduce bias due to inexact matching in the binary treatment\u0000case. However, to our knowledge, no existing work has systematically\u0000investigated bias due to inexact matching in the continuous treatment case. To\u0000fill this blank, we propose a general framework for reducing bias in inexactly\u0000matched observational studies with continuous treatments. In the matching\u0000stage, we propose a carefully formulated caliper that incorporates the\u0000information of both the paired covariates and treatment doses to better tailor\u0000matching for the downstream SATE estimation and inference. In the estimation\u0000and inference stage, we propose a bias-corrected Neyman estimator paired with\u0000the corresponding bias-corrected variance estimator to leverage the information\u0000on propensity density discrepancies after inexact matching to further reduce\u0000the bias due to inexact matching. We apply our proposed framework to COVID-19\u0000social mobility data to showcase differences between classic and bias-corrected\u0000SATE estimation and inference.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting age distribution of life-table death counts via α-transformation","authors":"Han Lin Shang, Steven Haberman","doi":"arxiv-2409.11658","DOIUrl":"https://doi.org/arxiv-2409.11658","url":null,"abstract":"We introduce a compositional power transformation, known as an\u0000{alpha}-transformation, to model and forecast a time series of life-table\u0000death counts, possibly with zero counts observed at older ages. As a\u0000generalisation of the isometric log-ratio transformation (i.e., {alpha} = 0),\u0000the {alpha} transformation relies on the tuning parameter {alpha}, which can\u0000be determined in a data-driven manner. Using the Australian age-specific period\u0000life-table death counts from 1921 to 2020, the {alpha} transformation can\u0000produce more accurate short-term point and interval forecasts than the\u0000log-ratio transformation. The improved forecast accuracy of life-table death\u0000counts is of great importance to demographers and government planners for\u0000estimating survival probabilities and life expectancy and actuaries for\u0000determining annuity prices and reserves for various initial ages and maturity\u0000terms.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-Values for Exponential Families: the General Case","authors":"Yunda Hao, Peter Grünwald","doi":"arxiv-2409.11134","DOIUrl":"https://doi.org/arxiv-2409.11134","url":null,"abstract":"We analyze common types of e-variables and e-processes for composite\u0000exponential family nulls: the optimal e-variable based on the reverse\u0000information projection (RIPr), the conditional (COND) e-variable, and the\u0000universal inference (UI) and sequen-tialized RIPr e-processes. We characterize\u0000the RIPr prior for simple and Bayes-mixture based alternatives, either\u0000precisely (for Gaussian nulls and alternatives) or in an approximate sense\u0000(general exponential families). We provide conditions under which the RIPr\u0000e-variable is (again exactly vs. approximately) equal to the COND e-variable.\u0000Based on these and other interrelations which we establish, we determine the\u0000e-power of the four e-statistics as a function of sample size, exactly for\u0000Gaussian and up to $o(1)$ in general. For $d$-dimensional null and alternative,\u0000the e-power of UI tends to be smaller by a term of $(d/2) log n + O(1)$ than\u0000that of the COND e-variable, which is the clear winner.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cointegrated Matrix Autoregression Models","authors":"Zebang Li, Han Xiao","doi":"arxiv-2409.10860","DOIUrl":"https://doi.org/arxiv-2409.10860","url":null,"abstract":"We propose a novel cointegrated autoregressive model for matrix-valued time\u0000series, with bi-linear cointegrating vectors corresponding to the rows and\u0000columns of the matrix data. Compared to the traditional cointegration analysis,\u0000our proposed matrix cointegration model better preserves the inherent structure\u0000of the data and enables corresponding interpretations. To estimate the\u0000cointegrating vectors as well as other coefficients, we introduce two types of\u0000estimators based on least squares and maximum likelihood. We investigate the\u0000asymptotic properties of the cointegrated matrix autoregressive model under the\u0000existence of trend and establish the asymptotic distributions for the\u0000cointegrating vectors, as well as other model parameters. We conduct extensive\u0000simulations to demonstrate its superior performance over traditional methods.\u0000In addition, we apply our proposed model to Fama-French portfolios and develop\u0000a effective pairs trading strategy.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Calibrated Multivariate Regression with Localized PIT Mappings","authors":"Lucas Kock, G. S. Rodrigues, Scott A. Sisson, Nadja Klein, David J. Nott","doi":"arxiv-2409.10855","DOIUrl":"https://doi.org/arxiv-2409.10855","url":null,"abstract":"Calibration ensures that predicted uncertainties align with observed\u0000uncertainties. While there is an extensive literature on recalibration methods\u0000for univariate probabilistic forecasts, work on calibration for multivariate\u0000forecasts is much more limited. This paper introduces a novel post-hoc\u0000recalibration approach that addresses multivariate calibration for potentially\u0000misspecified models. Our method involves constructing local mappings between\u0000vectors of marginal probability integral transform values and the space of\u0000observations, providing a flexible and model free solution applicable to\u0000continuous, discrete, and mixed responses. We present two versions of our\u0000approach: one uses K-nearest neighbors, and the other uses normalizing flows.\u0000Each method has its own strengths in different situations. We demonstrate the\u0000effectiveness of our approach on two real data applications: recalibrating a\u0000deep neural network's currency exchange rate forecast and improving a\u0000regression model for childhood malnutrition in India for which the multivariate\u0000response has both discrete and continuous components.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of g-estimation approaches for handling symptomatic medication at multiple timepoints in Alzheimer's Disease with a hypothetical strategy","authors":"Florian Lasch, Lorenzo Guizzaro, Wen Wei Loh","doi":"arxiv-2409.10943","DOIUrl":"https://doi.org/arxiv-2409.10943","url":null,"abstract":"For handling intercurrent events in clinical trials, one of the strategies\u0000outlined in the ICH E9(R1) addendum targets the hypothetical scenario of\u0000non-occurrence of the intercurrent event. While this strategy is often\u0000implemented by setting data after the intercurrent event to missing even if\u0000they have been collected, g-estimation allows for a more efficient estimation\u0000by using the information contained in post-IE data. As the g-estimation methods\u0000have largely developed outside of randomised clinical trials, optimisations for\u0000the application in clinical trials are possible. In this work, we describe and\u0000investigate the performance of modifications to the established g-estimation\u0000methods, leveraging the assumption that some intercurrent events are expected\u0000to have the same impact on the outcome regardless of the timing of their\u0000occurrence. In a simulation study in Alzheimer disease, the modifications show\u0000a substantial efficiency advantage for the estimation of an estimand that\u0000applies the hypothetical strategy to the use of symptomatic treatment while\u0000retaining unbiasedness and adequate type I error control.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-driven stochastic 3D modeling of the nanoporous binder-conductive additive phase in battery cathodes","authors":"Phillip Gräfensteiner, Markus Osenberg, André Hilger, Nicole Bohn, Joachim R. Binder, Ingo Manke, Volker Schmidt, Matthias Neumann","doi":"arxiv-2409.11080","DOIUrl":"https://doi.org/arxiv-2409.11080","url":null,"abstract":"A stochastic 3D modeling approach for the nanoporous binder-conductive\u0000additive phase in hierarchically structured cathodes of lithium-ion batteries\u0000is presented. The binder-conductive additive phase of these electrodes consists\u0000of carbon black, polyvinylidene difluoride binder and graphite particles. For\u0000its stochastic 3D modeling, a three-step procedure based on methods from\u0000stochastic geometry is used. First, the graphite particles are described by a\u0000Boolean model with ellipsoidal grains. Second, the mixture of carbon black and\u0000binder is modeled by an excursion set of a Gaussian random field in the\u0000complement of the graphite particles. Third, large pore regions within the\u0000mixture of carbon black and binder are described by a Boolean model with\u0000spherical grains. The model parameters are calibrated to 3D image data of\u0000cathodes in lithium-ion batteries acquired by focused ion beam scanning\u0000electron microscopy. Subsequently, model validation is performed by comparing\u0000model realizations with measured image data in terms of various morphological\u0000descriptors that are not used for model fitting. Finally, we use the stochastic\u00003D model for predictive simulations, where we generate virtual, yet realistic,\u0000image data of nanoporous binder-conductive additives with varying amounts of\u0000graphite particles. Based on these virtual nanostructures, we can investigate\u0000structure-property relationships. In particular, we quantitatively study the\u0000influence of graphite particles on effective transport properties in the\u0000nanoporous binder-conductive additive phase, which have a crucial impact on\u0000electrochemical processes in the cathode and thus on the performance of battery\u0000cells.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"187 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chasing Shadows: How Implausible Assumptions Skew Our Understanding of Causal Estimands","authors":"Stijn Vansteelandt, Kelly Van Lancker","doi":"arxiv-2409.11162","DOIUrl":"https://doi.org/arxiv-2409.11162","url":null,"abstract":"The ICH E9 (R1) addendum on estimands, coupled with recent advancements in\u0000causal inference, has prompted a shift towards using model-free treatment\u0000effect estimands that are more closely aligned with the underlying scientific\u0000question. This represents a departure from traditional, model-dependent\u0000approaches where the statistical model often overshadows the inquiry itself.\u0000While this shift is a positive development, it has unintentionally led to the\u0000prioritization of an estimand's theoretical appeal over its practical\u0000learnability from data under plausible assumptions. We illustrate this by\u0000scrutinizing assumptions in the recent clinical trials literature on principal\u0000stratum estimands, demonstrating that some popular assumptions are not only\u0000implausible but often inevitably violated. We advocate for a more balanced\u0000approach to estimand formulation, one that carefully considers both the\u0000scientific relevance and the practical feasibility of estimation under\u0000realistic conditions.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}