Joshua S. North, Christopher K. Wikle, Erin M. Schliep
{"title":"A Review of Data‐Driven Discovery for Dynamic Systems","authors":"Joshua S. North, Christopher K. Wikle, Erin M. Schliep","doi":"10.1111/insr.12554","DOIUrl":"https://doi.org/10.1111/insr.12554","url":null,"abstract":"Many real‐world scientific processes are governed by complex non‐linear dynamic systems that can be represented by differential equations. Recently, there has been an increased interest in learning, or discovering, the forms of the equations driving these complex non‐linear dynamic systems using data‐driven approaches. In this paper, we review the current literature on data‐driven discovery for dynamic systems. We provide a categorisation to the different approaches for data‐driven discovery and a unified mathematical framework to show the relationship between the approaches. Importantly, we discuss the role of statistics in the data‐driven discovery field, describe a possible approach by which the problem can be cast in a statistical framework and provide avenues for future work.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135132140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Penalisation Methods in Fitting High-Dimensional Cointegrated Vector Autoregressive Models: A Review","authors":"Marie Levakova, Susanne Ditlevsen","doi":"10.1111/insr.12553","DOIUrl":"10.1111/insr.12553","url":null,"abstract":"Cointegration has proven useful for modeling non-stationary data with long-run equilibrium relationships among variables, with applications in many fields such as econometrics, climate research and biology. However, the analyses of vector autoregressive models are becoming more difficult as data sets of higher dimensions are becoming available, in particular because the number of parameters is quadratic in the number of variables. This leads to a lack of statistical robustness, and regularisation methods are paramount for obtaining valid estimates. In the last decade, many papers have appeared suggesting different penalisation approaches to the inference problem. Here, we make a comprehensive review of different penalisation methods adapted to the specific structure of vector cointegrated models suggested in the literature, with relevant references to software packages. The methods are evaluated and compared according to a range of error measures in a simulation study, considering combinations of low and high dimension of the system and small and large sample sizes.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"92 2","pages":"160-193"},"PeriodicalIF":1.7,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/insr.12553","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135014699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ziqing Dong, Yves Tille, Giovanni Maria Giorgi, Alessio Guandalini
{"title":"Generalised Income Inequality Index","authors":"Ziqing Dong, Yves Tille, Giovanni Maria Giorgi, Alessio Guandalini","doi":"10.1111/insr.12551","DOIUrl":"10.1111/insr.12551","url":null,"abstract":"This paper proposes a deep generalisation for income inequality indices. A generalised income inequality index that depends on two parameters and that involves a large set of income inequality indices in the same framework is proposed. The two parameters control the sensitivity of the generalised index to different levels of the income distribution. A thorough investigation of the generalised index paves the way for understanding the influence of the low, middle and high incomes on various income inequality indices and thereby facilitates the choice of multiple indices simultaneously for a better analysis of inequality as advocated by several recent studies. Moreover, two methods for estimating the generalised index in the case of finite populations are shown. A new method for estimating the inequality indices is proposed.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"92 1","pages":"87-105"},"PeriodicalIF":2.0,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/insr.12551","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48981484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Number Savvy: From the Invention of Numbers to the Future of Data, George Sciadas Chapman & Hall/CRC, 2022, 312 pages, £56.99/$74.95, hardcover ISBN 9781032362151","authors":"Fabrizio Durante","doi":"10.1111/insr.12550","DOIUrl":"10.1111/insr.12550","url":null,"abstract":"","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"91 2","pages":"348"},"PeriodicalIF":2.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46364123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modern Applied Regressions: Bayesian and Frequentist Analysis of Categorical and Limited Response Variables with R and Stan, Jun Xu Chapman & Hall/CRC, 2023, xv + 281 pages, £80.99/$108, hardcover ISBN: 9780367173876 (hbk); 9781032376745 (pbk); 9780429056468 (ebk)","authors":"Shuangzhe Liu","doi":"10.1111/insr.12548","DOIUrl":"10.1111/insr.12548","url":null,"abstract":"","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"91 2","pages":"345-347"},"PeriodicalIF":2.0,"publicationDate":"2023-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46140403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anna Pajor, Justyna Wróblewska, Łukasz Kwiatkowski, Jacek Osiewalski
{"title":"Hybrid SV-GARCH, t-GARCH and Markov-switching covariance structures in VEC models—Which is better from a predictive perspective?","authors":"Anna Pajor, Justyna Wróblewska, Łukasz Kwiatkowski, Jacek Osiewalski","doi":"10.1111/insr.12546","DOIUrl":"10.1111/insr.12546","url":null,"abstract":"We compare predictive performance of a multitude of alternative Bayesian vector autoregression (VAR) models allowing for cointegration and time-varying conditional covariances, described by different multivariate stochastic volatility (MSV) models, including their hybrids with multivariate GARCH processes (MSV-MGARCH), as well as t-GARCH and Markov-switching structures. The forecast accuracy is evaluated mainly through predictive Bayes factors, but energy scores and the probability integral transform are also used. Two empirical studies, for the US and Polish economies, are based on a small model of monetary policy comprising inflation, unemployment and interest rate. The results indicate that capturing conditional heteroskedasticity by some MSV-MGARCH specifications contributes the most to the forecasting power of the VAR/VEC model.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"92 1","pages":"62-86"},"PeriodicalIF":2.0,"publicationDate":"2023-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41819387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effect: An Introduction to Research Design and Causality, Nick Huntington-Klein Chapman & Hall/CRC, 2022, xiv + 620 pages, $39.95, paperback. ISBN: 9781032125787","authors":"Brian W. Sloboda","doi":"10.1111/insr.12547","DOIUrl":"10.1111/insr.12547","url":null,"abstract":"","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"91 2","pages":"343-345"},"PeriodicalIF":2.0,"publicationDate":"2023-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49021756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patrick Toman, N. Ravishanker, S. Rajasekaran, Nathan Lally
{"title":"Online Evidential Nearest Neighbour Classification for Internet of Things Time Series","authors":"Patrick Toman, N. Ravishanker, S. Rajasekaran, Nathan Lally","doi":"10.1111/insr.12540","DOIUrl":"https://doi.org/10.1111/insr.12540","url":null,"abstract":"The ‘Internet of Things’ (IoT) is a rapidly developing set of technologies that leverages large numbers of networked sensors to relay data in an online fashion. Typically, knowledge of the sensor environment is incomplete and subject to changes over time. There is a need to employ classification algorithms to understand the data. We first review existing time series classification (TSC) approaches, with emphasis on the well‐known k‐nearest neighbours (kNN) methods. We extend these to dynamical kNN classifiers, and discuss their shortcomings for handling the inherent uncertainty in IoT data. We next review evidential kNN (EkNN) classifiers that leverage the well‐known Dempster–Shafer theory to allow principled uncertainty quantification. We develop a dynamic EkNN approach for classifying IoT streams via algorithms that use evidential theoretic pattern rejection rules for (i) classifying incoming patterns into a set of oracle classes, (ii) automatically pruning ambiguously labelled patterns such as aberrant streams (due to malfunctioning sensors, say), and (iii) identifying novel classes that may emerge in new subsequences over time. While these methods have wide applicability in many domains, we illustrate the dynamic kNN and EkNN approaches for classifying a large, noisy IoT time series dataset from an insurance firm.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":" ","pages":""},"PeriodicalIF":2.0,"publicationDate":"2023-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45000249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Edgar Santos-Fernandez, Julie Vercelloni, Aiden Price, Grace Heron, Bryce Christensen, Erin E. Peterson, Kerrie Mengersen
{"title":"Increasing Trust in New Data Sources: Crowdsourcing Image Classification for Ecology","authors":"Edgar Santos-Fernandez, Julie Vercelloni, Aiden Price, Grace Heron, Bryce Christensen, Erin E. Peterson, Kerrie Mengersen","doi":"10.1111/insr.12542","DOIUrl":"10.1111/insr.12542","url":null,"abstract":"Crowdsourcing methods facilitate the production of scientific information by non-experts. This form of citizen science (CS) is becoming a key source of complementary data in many fields to inform data-driven decisions and study challenging problems. However, concerns about the validity of these data often constrain their utility. In this paper, we focus on the use of citizen science data in addressing complex challenges in environmental conservation. We consider this issue from three perspectives. First, we present a literature scan of papers that have employed Bayesian models with citizen science in ecology. Second, we compare several popular majority vote algorithms and introduce a Bayesian item response model that estimates and accounts for participants' abilities after adjusting for the difficulty of the images they have classified. The model also enables participants to be clustered into groups based on ability. Third, we apply the model in a case study involving the classification of corals from underwater images from the Great Barrier Reef, Australia. We show that the model achieved superior results in general and, for difficult tasks, a weighted consensus method that uses only groups of experts and experienced participants produced better performance measures. Moreover, we found that participants learn as they have more classification opportunities, which substantially increases their abilities over time. Overall, the paper demonstrates the feasibility of CS for answering complex and challenging ecological questions when these data are appropriately analysed. This serves as motivation for future work to increase the efficacy and trustworthiness of this emerging source of data.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":"92 1","pages":"43-61"},"PeriodicalIF":2.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/insr.12542","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43727234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}