{"title":"Differential equations in data analysis","authors":"I. Dattner","doi":"10.1002/wics.1534","DOIUrl":"https://doi.org/10.1002/wics.1534","url":null,"abstract":"Differential equations have proven to be a powerful mathematical tool in science and engineering, leading to better understanding, prediction, and control of dynamic processes. In this paper, we review the role played by differential equations in data analysis. More specifically, we consider the intersection between differential equations and data analysis in the light of modern statistical learning methodologies.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1534","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47930443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On semiparametric regression in functional data analysis","authors":"N. Ling, P. Vieu","doi":"10.1002/wics.1538","DOIUrl":"https://doi.org/10.1002/wics.1538","url":null,"abstract":"The aim of this paper is to provide a selective advanced review of semiparametric regression, an emerging and promising field of research in functional data analysis. As a deliberate strategy, we focus our discussion on the single functional index regression (SFIR) model in order to fix ideas about the stakes linked with infinite dimensional problems and about the methodological challenges that one has to solve when building statistical procedures, one of the most challenging issues being the reduction of dimensionality effects. This is the first (and main) part of the discussion, and a complete survey of the literature on the SFIR model is presented. In the second part, other semiparametric models (and, more generally, other dimension reduction models) are briefly discussed with the double goal of presenting the state of the art and of defining challenging directions for the future. Finally, we discuss how additive modeling is an appealing idea for more complicated models involving multifunctional predictors, and some directions for future work are pointed out in this setting.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1538","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48091620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modern Monte Carlo methods for efficient uncertainty quantification and propagation: A survey","authors":"Jiaxin Zhang","doi":"10.1002/wics.1539","DOIUrl":"https://doi.org/10.1002/wics.1539","url":null,"abstract":"Uncertainty quantification (UQ) includes the characterization, integration, and propagation of uncertainties that result from stochastic variations and a lack of knowledge or data in the natural world. The Monte Carlo (MC) method is a sampling‐based approach that has been widely used for quantification and propagation of uncertainties. However, the standard MC method is often time‐consuming if the simulation‐based model is computationally intensive. This article gives an overview of modern MC methods to address the existing challenges of the standard MC in the context of UQ. Specifically, multilevel Monte Carlo (MLMC), extending the concept of control variates, achieves a significant reduction of the computational cost by performing most evaluations with low accuracy and corresponding low cost, and relatively few evaluations at high accuracy and corresponding high cost. Multifidelity Monte Carlo (MFMC) accelerates the convergence of standard Monte Carlo by generalizing the control variates with different models having varying fidelities and varying computational costs. The multimodel Monte Carlo method (MMMC), having a different setting from MLMC and MFMC, aims to address the issue of UQ and propagation when data for characterizing probability distributions are limited. Multimodel inference combined with importance sampling is proposed for quantifying and efficiently propagating the uncertainties resulting from small data sets. All three of these modern MC methods achieve a significant improvement of computational efficiency for probabilistic UQ, particularly uncertainty propagation. An algorithm summary and the corresponding code implementation are provided for each of the modern MC methods. The extension and application of these methods are discussed in detail.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1539","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48021151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian mixture models for cytometry data analysis","authors":"Lin Lin, B. Hejblum","doi":"10.1002/wics.1535","DOIUrl":"https://doi.org/10.1002/wics.1535","url":null,"abstract":"Bayesian mixture models are increasingly used for model‐based clustering and the follow‐up analysis on the clusters identified. As such, they are of particular interest for analyzing cytometry data where unsupervised clustering and association studies are often part of the scientific questions. Cytometry data are large quantitative data measured in a multidimensional space that typically ranges from a few dimensions to several dozens, and which keeps increasing due to innovative high‐throughput biotechnologies. We present several recent parametric and nonparametric Bayesian mixture modeling approaches, and describe advantages and limitations of these models under different research contexts for cytometry data analysis. We also acknowledge current computational challenges associated with the use of Bayesian mixture models for analyzing cytometry data, and we draw attention to recent developments in advanced numerical algorithms for estimating large Bayesian mixture models, which we believe have the potential to make Bayesian mixture models more applicable to new types of single‐cell data with higher dimensions.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1535","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48335075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Item response theory and its applications in educational measurement Part I: Item response theory and its implementation in R","authors":"Kazuki Hori, Hirotaka Fukuhara, Tsuyoshi Yamada","doi":"10.1002/wics.1531","DOIUrl":"https://doi.org/10.1002/wics.1531","url":null,"abstract":"Item response theory (IRT) is a class of latent variable models, which are used to develop educational and psychological tests (e.g., standardized tests, personality tests, tests for licensure, and certification). We review the theory and practices of IRT across two articles. In Part 1, we provide a broad range of topics such as foundations of educational measurement, basics of IRT, and applications of IRT using R. We focus particularly on the topics that the mirt package covers. These include unidimensional and multidimensional IRT models for dichotomous and polytomous items with continuous and discrete factors, confirmatory analysis and multigroup analysis in IRT, and estimation algorithms. In Part 2, on the other hand, we focus on more practical aspects of IRT, namely scoring, scaling, and equating.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1531","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44495133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational techniques for parameter estimation of gravitational wave signals","authors":"R. Meyer, M. Edwards, P. Maturana-Russel, N. Christensen","doi":"10.1002/wics.1532","DOIUrl":"https://doi.org/10.1002/wics.1532","url":null,"abstract":"Since the very first detection of gravitational waves from the coalescence of two black holes in 2015, Bayesian statistical methods have been routinely applied by LIGO and Virgo to extract the signal out of noisy interferometric measurements, obtain point estimates of the physical parameters responsible for producing the signal, and rigorously quantify their uncertainties. Different computational techniques have been devised depending on the source of the gravitational radiation and the gravitational waveform model used. Prominent sources of gravitational waves are binary black hole or neutron star mergers, the only objects that have been observed by detectors to date. But also gravitational waves from core‐collapse supernovae, rapidly rotating neutron stars, and the stochastic gravitational‐wave background are in the sensitivity band of the ground‐based interferometers and expected to be observable in future observation runs. As nonlinearities of the complex waveforms and the high‐dimensional parameter spaces preclude analytic evaluation of the posterior distribution, posterior inference for all these sources relies on computer‐intensive simulation techniques such as Markov chain Monte Carlo methods. A review of state‐of‐the‐art Bayesian statistical parameter estimation methods will be given for researchers in this cross‐disciplinary area of gravitational wave data analysis.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1532","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47899957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conway–Maxwell–Poisson regression models for dispersed count data","authors":"Kimberly F. Sellers, Bailey Premeaux","doi":"10.1002/wics.1533","DOIUrl":"https://doi.org/10.1002/wics.1533","url":null,"abstract":"While Poisson regression serves as a standard tool for modeling the association between a count response variable and explanatory variables, it is well‐documented that this approach is limited by the Poisson model's assumption of data equi‐dispersion. The Conway–Maxwell–Poisson (COM‐Poisson) distribution has demonstrated itself as a viable alternative for real count data that express data over‐ or under‐dispersion, and thus the COM‐Poisson regression can flexibly model associations involving a discrete count response variable and covariates. This work overviews the ongoing developmental knowledge and advancement of COM‐Poisson regression, introducing the reader to the underlying model (and its considered reparametrizations) and related regression constructs, including zero‐inflated models, and longitudinal studies. This manuscript further introduces readers to associated computing tools available to perform COM‐Poisson and related regressions.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1533","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43051686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data analysis on nonstandard spaces","authors":"S. Huckemann, B. Eltzner","doi":"10.1002/wics.1526","DOIUrl":"https://doi.org/10.1002/wics.1526","url":null,"abstract":"The task of writing on data analysis on nonstandard spaces is quite substantial, with a huge body of literature to cover, from parametrics to nonparametrics, from shape spaces to Wasserstein spaces. In this survey we convey simple (e.g., Fréchet means) and more complicated ideas (e.g., empirical process theory), common to many approaches, with a focus on their interaction with one another. Indeed, this field is fast growing and it is imperative to develop a mathematical viewpoint, drawing power and diversity from a higher level of abstraction, for example, by introducing generalized Fréchet means. While many problems have found ingenious solutions (e.g., Procrustes analysis for principal component analysis [PCA] extensions on shape spaces and diffusion on the frame bundle to mimic anisotropic Gaussians), more problems emerge, often more difficult (e.g., topology and geometry influencing limiting rates and defining generic intrinsic PCA extensions). Throughout this survey, we point out some open problems that will, as it seems, keep mathematicians, statisticians, computer and data scientists busy for a while.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1526","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42836597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Competing risks analysis for discrete time‐to‐event data","authors":"M. Schmid, M. Berger","doi":"10.1002/wics.1529","DOIUrl":"https://doi.org/10.1002/wics.1529","url":null,"abstract":"This article presents an overview of statistical methods for the analysis of discrete failure times with competing events. We describe the most commonly used modeling approaches for this type of data, including discrete versions of the cause‐specific hazards model and the subdistribution hazard model. In addition to discussing the characteristics of these methods, we present approaches to nonparametric estimation and model validation. Our literature review suggests that discrete competing‐risks analysis has gained substantial interest in the research community and is used regularly in econometrics, biostatistics, and educational research.","PeriodicalId":47779,"journal":{"name":"Wiley Interdisciplinary Reviews-Computational Statistics","volume":null,"pages":null},"PeriodicalIF":3.2,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/wics.1529","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42117090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}