{"title":"Approximate Confidence Distribution Computing","authors":"S. Thornton, Wentao Li, Min-ge Xie","doi":"10.51387/23-nejsds38","DOIUrl":"https://doi.org/10.51387/23-nejsds38","url":null,"abstract":"Approximate confidence distribution computing (ACDC) offers a new take on the rapidly developing field of likelihood-free inference from within a frequentist framework. The appeal of this computational method for statistical inference hinges upon the concept of a confidence distribution, a special type of estimator defined with respect to the repeated sampling principle. An ACDC method provides frequentist validation for computational inference in problems with unknown or intractable likelihoods. The main theoretical contribution of this work is the identification of a matching condition necessary for frequentist validity of inference from this method. In addition to providing an example of how a modern understanding of confidence distribution theory can be used to connect Bayesian and frequentist inferential paradigms, we present a case to expand the current scope of so-called approximate Bayesian inference to include non-Bayesian inference by targeting a confidence distribution rather than a posterior. The main practical contribution of this work is the development of a data-driven approach to drive ACDC in both Bayesian and frequentist contexts. The ACDC algorithm is data-driven through the selection of a data-dependent proposal function, the structure of which is quite general and adaptable to many settings. We explore three numerical examples that both verify the theoretical arguments in the development of ACDC and suggest instances in which ACDC outperforms approximate Bayesian computing methods computationally.","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89760848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Marginalization of Correlated Latent Variables with Applications to Learning Particle Interaction Kernels","authors":"Mengyang Gu, Xubo Liu, X. Fang, Sui Tang","doi":"10.51387/22-nejsds13","DOIUrl":"https://doi.org/10.51387/22-nejsds13","url":null,"abstract":"Marginalization of latent variables or nuisance parameters is a fundamental aspect of Bayesian inference and uncertainty quantification. In this work, we focus on scalable marginalization of latent variables in modeling correlated data, such as spatio-temporal or functional observations. We first introduce Gaussian processes (GPs) for modeling correlated data and highlight the computational challenge: the computational complexity increases cubically with the number of observations. We then review the connection between the state space model and GPs with Matérn covariance for temporal inputs. The Kalman filter and Rauch-Tung-Striebel smoother are introduced as a scalable marginalization technique for computing the likelihood and making predictions from GPs without approximation. We review recent efforts to extend the scalable marginalization idea to the linear model of coregionalization for multivariate correlated output and spatio-temporal observations. In the final part of this work, we introduce a novel marginalization technique to estimate interaction kernels and forecast particle trajectories. The computational advance lies in the sparse representation of the inverse covariance matrix of the latent variables, combined with conjugate gradient methods to improve predictive accuracy with large data sets. The computational advances achieved in this work open up a wide range of applications in molecular dynamics simulation, cellular migration, and agent-based models.","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87968223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment on “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram,” by Xiao-Li Meng","authors":"T. Junk","doi":"10.51387/22-nejsds6b","DOIUrl":"https://doi.org/10.51387/22-nejsds6b","url":null,"abstract":"This contribution is a series of comments on Prof. Xiao-Li Meng’s article, “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram”. Prof. Meng’s article offers some radical proposals and not-so-radical proposals to improve the quality of statistical inference used in the sciences and also to extend distributional thinking to early education. Discussions and alternative proposals are presented.","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"1997 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82485810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comments on Xiao-Li Meng’s Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram","authors":"D. Lin","doi":"10.51387/23-nejsds6e","DOIUrl":"https://doi.org/10.51387/23-nejsds6e","url":null,"abstract":"","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"633 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78985207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Four Types of Frequentism and Their Interplay with Bayesianism","authors":"James O. Berger","doi":"10.51387/22-nejsds4","DOIUrl":"https://doi.org/10.51387/22-nejsds4","url":null,"abstract":"","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"20 6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83470093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram","authors":"X. Meng","doi":"10.51387/22-nejsds6","DOIUrl":"https://doi.org/10.51387/22-nejsds6","url":null,"abstract":"This article expands upon my presentation to the panel on “The Radical Prescription for Change” at the 2017 ASA (American Statistical Association) symposium on A World Beyond $p<0.05$. It emphasizes that, to greatly enhance the reliability of—and hence public trust in—statistical and data scientific findings, we need to take a holistic approach. We need to lead by example, incentivize study quality, and inoculate future generations with a profound appreciation of the world of uncertainty and the uncertainty world. The four “radical” proposals in the title—with all their inherent defects and trade-offs—are designed to provoke reactions and actions. First, research methodologies are trustworthy only if they deliver what they promise, even if this means that they have to be overly protective, a necessary trade-off for practicing quality-guaranteed statistics. This guiding principle may compel us to double the variance in some situations, a strategy that also coincides with the call to raise the bar from $p<0.05$ to $p<0.005$ [3]. Second, teaching principled practicality or corner-cutting is a promising strategy to enhance the scientific community’s as well as the general public’s ability to spot—and hence to deter—flawed arguments or findings. A remarkable quick-and-dirty Bayes formula for rare events, which simply divides the prevalence by the sum of the prevalence and the false positive rate (or the total error rate), as featured on the popular radio show Car Talk, illustrates the effectiveness of this strategy. Third, it should be a routine mental exercise to put ourselves in the shoes of those who would be affected by our research findings, in order to combat the tendency of rushing to conclusions or overstating confidence in our findings. A pufferfish/selfish test can serve as an effective reminder, and can help to institute the mantra “Thou shalt not sell what thou refuseth to buy” as the most basic professional decency. Considering personal stakes in our statistical endeavors also points to the concept of behavioral statistics, in the spirit of behavioral economics. Fourth, the current mathematical education paradigm that puts “deterministic first, stochastic second” is likely responsible for the general difficulties with reasoning under uncertainty, a situation that can be improved by introducing the concept of a histogram, or rather a kidstogram, as early as the concept of counting.","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84393771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment on “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram,” by Xiao-Li Meng","authors":"E. Kolaczyk","doi":"10.51387/22-nejsds6c","DOIUrl":"https://doi.org/10.51387/22-nejsds6c","url":null,"abstract":"","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78069420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment on “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw your Kidstogram” by Xiao-Li Meng","authors":"C. Franklin","doi":"10.51387/22-nejsds6d","DOIUrl":"https://doi.org/10.51387/22-nejsds6d","url":null,"abstract":"","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"213 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79265392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Radical and Not-So-Radical Principles and Practices: Discussion of Meng","authors":"R. Wasserstein, A. Schirm, N. Lazar","doi":"10.51387/22-nejsds6a","DOIUrl":"https://doi.org/10.51387/22-nejsds6a","url":null,"abstract":"We highlight points of agreement between Meng’s suggested principles and those proposed in our 2019 editorial in The American Statistician. We also discuss some questions that arise in the application of Meng’s principles in practice.","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82811017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Total i3+3 (Ti3+3) Design for Assessing Multiple Types and Grades of Toxicity in Phase I Trials","authors":"Meizi Liu, Yuan Ji, Ji Lin","doi":"10.51387/22-nejsds7","DOIUrl":"https://doi.org/10.51387/22-nejsds7","url":null,"abstract":"Phase I trials investigate the toxicity profile of a new treatment and identify the maximum tolerated dose for further evaluation. Most phase I trials use a binary dose-limiting toxicity endpoint to summarize the toxicity profile of a dose. In reality, reported toxicity information is much more abundant, including various types and grades of adverse events. Building upon the i3+3 design (Liu et al., 2020), we propose the Ti3+3 design, in which the letter “T” represents “total” toxicity. The proposed design takes into account multiple toxicity types and grades by computing the toxicity burden at each dose. The Ti3+3 design aims to achieve desirable operating characteristics using a simple statistical framework that utilizes a “toxicity burden interval” (TBI). Simulation results show that Ti3+3 demonstrates performance comparable to existing, more complex designs.","PeriodicalId":94360,"journal":{"name":"The New England Journal of Statistics in Data Science","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83280352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}