The New England Journal of Statistics in Data Science: Latest Articles, Page 4

Scalable Marginalization of Correlated Latent Variables with Applications to Learning Particle Interaction Kernels

The New England Journal of Statistics in Data Science Pub Date : 2022-03-16 DOI: 10.51387/22-nejsds13

Mengyang Gu, Xubo Liu, X. Fang, Sui Tang

Abstract: Marginalization of latent variables or nuisance parameters is a fundamental aspect of Bayesian inference and uncertainty quantification. In this work, we focus on scalable marginalization of latent variables in modeling correlated data, such as spatio-temporal or functional observations. We first introduce Gaussian processes (GPs) for modeling correlated data and highlight the computational challenge, where the computational complexity increases cubically fast along with the number of observations. We then review the connection between the state space model and GPs with Matérn covariance for temporal inputs. The Kalman filter and Rauch-Tung-Striebel smoother were introduced as a scalable marginalization technique for computing the likelihood and making predictions of GPs without approximation. We introduce recent efforts on extending the scalable marginalization idea to the linear model of coregionalization for multivariate correlated output and spatio-temporal observations. In the final part of this work, we introduce a novel marginalization technique to estimate interaction kernels and forecast particle trajectories. The computational progress lies in the sparse representation of the inverse covariance matrix of the latent variables, then applying conjugate gradient for improving predictive accuracy with large data sets. The computational advances achieved in this work outline a wide range of applications in molecular dynamic simulation, cellular migration, and agent-based models.
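
The link between state space models and GPs mentioned in the abstract can be illustrated with a small Python sketch. This is not the authors' code. It assumes the simplest member of the Matérn family, the exponential kernel (roughness 1/2), for which the latent process is a first-order autoregression between sorted time points. The function names, parameter names (sigma2, gamma, noise_var) and the simulated data are all hypothetical choices made for illustration. The sketch computes the same Gaussian log-likelihood twice: once from the full covariance matrix, at cubic cost, and once with a scalar Kalman filter, at linear cost.

```python
import numpy as np


def loglik_direct(t, y, sigma2, gamma, noise_var):
    """Log-likelihood from the full n x n covariance matrix (cubic cost)."""
    n = len(t)
    # Exponential (Matern 1/2) kernel plus independent observation noise.
    cov = sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / gamma)
    cov += noise_var * np.eye(n)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (quad + logdet + n * np.log(2.0 * np.pi))


def loglik_kalman(t, y, sigma2, gamma, noise_var):
    """Same log-likelihood via a scalar Kalman filter (linear cost).

    Assumes t is sorted in increasing order.
    """
    loglik = 0.0
    mean, var = 0.0, sigma2  # stationary prior for the first latent value
    for i in range(len(t)):
        if i > 0:
            # Predict: the latent process is AR(1) between time points.
            rho = np.exp(-(t[i] - t[i - 1]) / gamma)
            mean = rho * mean
            var = rho**2 * var + sigma2 * (1.0 - rho**2)
        # One-step-ahead predictive distribution of y[i].
        s = var + noise_var
        resid = y[i] - mean
        loglik += -0.5 * (np.log(2.0 * np.pi * s) + resid**2 / s)
        # Update the latent state with the new observation.
        gain = var / s
        mean = mean + gain * resid
        var = (1.0 - gain) * var
    return loglik


# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, size=200))
y = np.sin(t) + 0.1 * rng.standard_normal(200)

ll_direct = loglik_direct(t, y, sigma2=1.0, gamma=2.0, noise_var=0.01)
ll_kalman = loglik_kalman(t, y, sigma2=1.0, gamma=2.0, noise_var=0.01)
print(ll_direct, ll_kalman)
```

The two values agree up to floating-point error, which is the sense in which the filter marginalizes the latent process "without approximation". Smoother Matérn kernels would need a vector-valued state, which this sketch does not cover.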

Citations: 6

Comment on “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram,” by Xiao-Li Meng

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds6b

T. Junk

Citations: 0

Comments on Xiao-Li Meng’s Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/23-nejsds6e

D. Lin

Citations: 0

Four Types of Frequentism and Their Interplay with Bayesianism

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds4

James O. Berger

Citations: 2

Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw your Kidstogram

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds6

X. Meng

Abstract: This article expands upon my presentation to the panel on “The Radical Prescription for Change” at the 2017 ASA (American Statistical Association) symposium on A World Beyond $p<0.05$. It emphasizes that, to greatly enhance the reliability of (and hence public trust in) statistical and data scientific findings, we need to take a holistic approach. We need to lead by example, incentivize study quality, and inoculate future generations with profound appreciations for the world of uncertainty and the uncertainty world. The four “radical” proposals in the title, with all their inherent defects and trade-offs, are designed to provoke reactions and actions. First, research methodologies are trustworthy only if they deliver what they promise, even if this means that they have to be overly protective, a necessary trade-off for practicing quality-guaranteed statistics. This guiding principle may compel us to doubling variance in some situations, a strategy that also coincides with the call to raise the bar from $p<0.05$ to $p<0.005$ [3]. Second, teaching principled practicality or corner-cutting is a promising strategy to enhance the scientific community’s as well as the general public’s ability to spot (and hence to deter) flawed arguments or findings. A remarkable quick-and-dirty Bayes formula for rare events, which simply divides the prevalence by the sum of the prevalence and the false positive rate (or the total error rate), as featured by the popular radio show Car Talk, illustrates the effectiveness of this strategy. Third, it should be a routine mental exercise to put ourselves in the shoes of those who would be affected by our research finding, in order to combat the tendency of rushing to conclusions or overstating confidence in our findings. A pufferfish/selfish test can serve as an effective reminder, and can help to institute the mantra “Thou shalt not sell what thou refuseth to buy” as the most basic professional decency. Considering personal stakes in our statistical endeavors also points to the concept of behavioral statistics, in the spirit of behavioral economics. Fourth, the current mathematical education paradigm that puts “deterministic first, stochastic second” is likely responsible for the general difficulties with reasoning under uncertainty, a situation that can be improved by introducing the concept of histogram, or rather kidstogram, as early as the concept of counting.
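
The quick-and-dirty Bayes formula described in the abstract (prevalence divided by the sum of prevalence and false positive rate) can be compared with the exact Bayes calculation in a few lines of Python. This is only an illustrative sketch: the function names and the numbers used (a prevalence of 0.1%, a sensitivity of 99%, a false positive rate of 5%) are hypothetical and are not taken from the article.

```python
def exact_ppv(prevalence, sensitivity, false_positive_rate):
    """Exact Bayes: P(condition | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def quick_ppv(prevalence, false_positive_rate):
    """Quick-and-dirty version: prevalence / (prevalence + false positive rate)."""
    return prevalence / (prevalence + false_positive_rate)


# Hypothetical rare-event setting, chosen only for illustration.
prevalence = 0.001
sensitivity = 0.99
false_positive_rate = 0.05

exact = exact_ppv(prevalence, sensitivity, false_positive_rate)
quick = quick_ppv(prevalence, false_positive_rate)
print(f"exact: {exact:.4f}  quick: {quick:.4f}")
```

With these inputs both formulas give a probability of roughly 2%. The shortcut works here because the event is rare and the sensitivity is close to 1, so the exact numerator is close to the prevalence and the factor (1 - prevalence) is close to 1.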

Citations: 2

Comment on “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram,” by Xiao-Li Meng

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds6c

E. Kolaczyk

Citations: 0

Comment on “Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw your Kidstogram” by Xiao-Li Meng

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds6d

C. Franklin

Citations: 0

Radical and Not-So-Radical Principles and Practices: Discussion of Meng

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds6a

R. Wasserstein, A. Schirm, N. Lazar

Citations: 0

The Total i3+3 (Ti3+3) Design for Assessing Multiple Types and Grades of Toxicity in Phase I Trials

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/22-nejsds7

Meizi Liu, Yuan Ji, Ji Lin

Citations: 0

Dietary Patterns and Cancer Risk: An Overview with Focus on Methods

The New England Journal of Statistics in Data Science Pub Date : 2022-01-01 DOI: 10.51387/23-nejsds35

V. Edefonti, R. De Vito, M. Parpinel, M. Ferraroni

Abstract: Traditionally, research in nutritional epidemiology has focused on specific foods/food groups or single nutrients in their relation with disease outcomes, including cancer. Dietary pattern analysis have been introduced to examine potential cumulative and interactive effects of individual dietary components of the overall diet, in which foods are consumed in combination. Dietary patterns can be identified by using evidence-based investigator-defined approaches or by using data-driven approaches, which rely on either response independent (also named “a posteriori” dietary patterns) or response dependent (also named “mixed-type” dietary patterns) multivariate statistical methods. Within the open methodological challenges related to study design, dietary assessment, identification of dietary patterns, confounding phenomena, and cancer risk assessment, the current paper provides an updated landscape review of novel methodological developments in the statistical analysis of a posteriori/mixed-type dietary patterns and cancer risk. The review starts from standard a posteriori dietary patterns from principal component, factor, and cluster analyses, including mixture models, and examines mixed-type dietary patterns from reduced rank regression, partial least squares, classification and regression tree analysis, and least absolute shrinkage and selection operator. Novel statistical approaches reviewed include Bayesian factor analysis with modeling of sparsity through shrinkage and sparse priors and frequentist focused principal component analysis. Most novelties relate to the reproducibility of dietary patterns across studies where potentialities of the Bayesian approach to factor and cluster analysis work at best.

Citations: 1