{"title":"Density estimation via Bayesian inference engines","authors":"M. P. Wand, J. C. F. Yu","doi":"10.1007/s10182-021-00422-8","DOIUrl":"10.1007/s10182-021-00422-8","url":null,"abstract":"<div><p>We explain how effective automatic probability density function estimates can be constructed using contemporary Bayesian inference engines such as those based on no-U-turn sampling and expectation propagation. Extensive simulation studies demonstrate that the proposed density estimates have excellent comparative performance and scale well to very large sample sizes due to a binning strategy. Moreover, the approach is fully Bayesian and all estimates are accompanied by point-wise credible intervals. An accompanying package in the <span>R</span> language facilitates easy use of the new density estimates.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43491620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RR-classifier: a nonparametric classification procedure in multidimensional space based on relative ranks","authors":"Ondrej Vencalek, Olusola Samuel Makinde","doi":"10.1007/s10182-021-00423-7","DOIUrl":"10.1007/s10182-021-00423-7","url":null,"abstract":"<div><p>Notions of data depth have motivated nonparametric multivariate analysis, especially in supervised learning. Maximum depth classifiers, classifiers based on depth-depth plots and depth distribution classifiers are nonparametric classification methodologies based on the notions of data depth and are Bayes-optimal rule under certain conditions. This paper proposes rank-rank plot for classification. Theoretical properties of the suggested classifier are investigated in some particular cases given by specific distributional assumptions. The performance of the proposed classification method is further investigated using simulated datasets.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50101338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Bayes modelling of penalty conversion rates of Bundesliga players","authors":"Christoph Hanck, Martin C. Arnold","doi":"10.1007/s10182-021-00420-w","DOIUrl":"10.1007/s10182-021-00420-w","url":null,"abstract":"<div><p>Judging by its significant potential to affect the outcome of a game in one single action, the penalty kick is arguably the most important set piece in football. Scientific studies on how the ability to convert a penalty kick is distributed among professional football players are scarce. In this paper, we consider how to rank penalty takers in the German Bundesliga based on historical data from 1963 to 2021. We use Bayesian models that improve inference on ability measures of individual players by imposing structural assumptions on an associated high-dimensional parameter space. These methods prove useful for our application, coping with the inherent difficulty that many players only take few penalties, making purely frequentist inference rather unreliable.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10182-021-00420-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47083807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous inference for functional data in sports biomechanics","authors":"Todd Colin Pataky, Konrad Abramowicz, Dominik Liebl, Alessia Pini, Sara Sjöstedt de Luna, Lina Schelin","doi":"10.1007/s10182-021-00418-4","DOIUrl":"10.1007/s10182-021-00418-4","url":null,"abstract":"<div><p>The recent sports science literature conveys a growing interest in robust statistical methods to analyze smooth, regularly-sampled functional data. This paper focuses on the inferential problem of identifying the parts of a functional domain where two population means differ. We considered four approaches recently used in sports science: interval-wise testing (IWT), statistical parametric mapping (SPM), statistical nonparametric mapping (SnPM) and the Benjamini-Hochberg (BH) procedure for false discovery control. We applied these procedures to both six representative sports science datasets, and also to systematically varied simulated datasets which replicated ten signal- and/or noise-relevant parameters that were identified in the experimental datasets. We observed generally higher IWT and BH sensitivity for five of the six experimental datasets. BH was the most sensitive procedure in simulation, but also had relatively high false positive rates (generally > 0.1) which increased sharply (> 0.3) in certain extreme simulation scenarios including highly rough data. SPM and SnPM were more sensitive than IWT in simulation except for (1) high roughness, (2) high nonstationarity, and (3) highly nonuniform smoothness. These results suggest that the optimum procedure is both signal and noise-dependent. We conclude that: (1) BH is most sensitive but also susceptible to high false positive rates, (2) IWT, SPM and SnPM appear to have relatively inconsequential differences in terms of domain identification sensitivity, except in cases of extreme signal/noise characteristics, where IWT appears to be superior at identifying a greater portion of the true signal.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10182-021-00418-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48085902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian nonparametric multi-sample test in any dimension","authors":"Luai Al-Labadi, Forough Fazeli Asl, Zahra Saberi","doi":"10.1007/s10182-021-00419-3","DOIUrl":"10.1007/s10182-021-00419-3","url":null,"abstract":"<div><p>This paper considers a general Bayesian test for the multi-sample problem. Specifically, for <i>M</i> independent samples, the interest is to determine whether the <i>M</i> samples are generated from the same multivariate population. First, <i>M</i> Dirichlet processes are considered as priors for the true distributions generated the data. Then, the concentration of the distribution of the total distance between the <i>M</i> posterior processes is compared to the concentration of the distribution of the total distance between the <i>M</i> prior processes through the relative belief ratio. The total distance between processes is established based on the energy distance. Various interesting theoretical results of the approach are derived. Several examples covering the high dimensional case are considered to illustrate the approach.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44175852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of final standings in football competitions with a premature ending: the case of COVID-19","authors":"P. Gorgi, S. J. Koopman, R. Lit","doi":"10.1007/s10182-021-00415-7","DOIUrl":"10.1007/s10182-021-00415-7","url":null,"abstract":"<div><p>We study an alternative approach to determine the final league table in football competitions with a premature ending. For several countries, a premature ending of the 2019/2020 football season has occurred due to the COVID-19 pandemic. We propose a model-based method as a possible alternative to the use of the incomplete standings to determine the final table. This method measures the performance of the teams in the matches of the season that have been played and predicts the remaining non-played matches through a paired-comparison model. The main advantage of the method compared to the incomplete standings is that it takes account of the bias in the performance measure due to the schedule of the matches in a season. Therefore, the resulting ranking of the teams based on our proposed method can be regarded as more fair in this respect. A forecasting study based on historical data of seven of the main European competitions is used to validate the method. The empirical results suggest that the model-based approach produces more accurate predictions of the true final standings than those based on the incomplete standings.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10182-021-00415-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9116379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Component-based structural equation modeling for the assessment of psycho-social aspects and performance of athletes","authors":"Rosa Fabbricatore, Maria Iannario, Rosaria Romano, Domenico Vistocco","doi":"10.1007/s10182-021-00417-5","DOIUrl":"10.1007/s10182-021-00417-5","url":null,"abstract":"<div><p>Recent studies have pointed out the effect of personality traits on athletes’ performance and success; however, fewer analyses have focused the relation among these features and specific athletic behaviors, skills, and strategies to enhance performance. To fill this void, the present paper provides evidence on what personality traits mostly affect athletes’ mental skills and, in turn, their effect on the performance of a sample of elite swimmers. The main findings were obtained by exploiting a component-based structural equation modeling which allows to analyze the relationships among some psychological constructs, measuring personality traits and mental skills, and a construct measuring sports performance. The partial least squares path modeling was employed, as it is the most recognized method among the component-based approaches. The introduced method simultaneously encompasses latent and emergent variables. Rather than focusing only on objective behaviors or game/race outcomes, such an approach evaluates variables not directly observable related to sport performance, such as cognition and affect, considering measurement error and measurement invariance, as well as the validity and reliability of the obtained latent constructs. The obtained results could be an asset to design strategies and interventions both for coaches and swimmers establishing an innovative use of statistical methods for maximizing athletes’ performance and well-being.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10182-021-00417-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46089199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new class of integer-valued GARCH models for time series of bounded counts with extra-binomial variation","authors":"Huaping Chen, Qi Li, Fukang Zhu","doi":"10.1007/s10182-021-00414-8","DOIUrl":"10.1007/s10182-021-00414-8","url":null,"abstract":"<div><p>This article considers a modeling problem of integer-valued time series of bounded counts in which the binomial index of dispersion of the observations is greater than one, i.e., the observations inhere the characteristic of extra-binomial variation. Most methods analyzing such characteristic are based on the conditional mean process instead of the observed process itself. To fill this gap, we introduce a new class of beta-binomial integer-valued GARCH models, establish the geometric moment contracting property of its conditional mean process, discuss the stationarity and ergodicity of the observed process and its conditional mean process, and give some stochastic properties of them. We consider the conditional maximum likelihood estimates and establish the asymptotic properties of the estimators. The performances of these estimators are compared via simulation studies. Finally, we apply the proposed models to two real data sets.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10182-021-00414-8","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44690126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contextual movement models based on normalizing flows","authors":"Samuel G. Fadel, Sebastian Mair, Ricardo da Silva Torres, Ulf Brefeld","doi":"10.1007/s10182-021-00412-w","DOIUrl":"10.1007/s10182-021-00412-w","url":null,"abstract":"<div><p>Movement models predict positions of players (or objects in general) over time and are thus key to analyzing spatiotemporal data as it is often used in sports analytics. Existing movement models are either designed from physical principles or are entirely data-driven. However, the former suffers from oversimplifications to achieve feasible and interpretable models, while the latter relies on computationally costly, from a current point of view, nonparametric density estimations and require maintaining multiple estimators, each responsible for different types of movements (e.g., such as different velocities). In this paper, we propose a unified contextual probabilistic movement model based on normalizing flows. Our approach learns the desired densities by directly optimizing the likelihood and maintains only a single contextual model that can be conditioned on auxiliary variables. Training is simultaneously performed on all observed types of movements, resulting in an effective and efficient movement model. We empirically evaluate our approach on spatiotemporal data from professional soccer. Our findings show that our approach outperforms the state of the art while being orders of magnitude more efficient with respect to computation time and memory requirements.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10182-021-00412-w","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45616832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Probabilistic Final Standing Calculator: a fair stochastic tool to handle abruptly stopped football seasons","authors":"Hans Van Eetvelde, Lars Magnus Hvattum, Christophe Ley","doi":"10.1007/s10182-021-00416-6","DOIUrl":"10.1007/s10182-021-00416-6","url":null,"abstract":"<div><p>The COVID-19 pandemic has left its marks in the sports world, forcing the full stop of all sports-related activities in the first half of 2020. Football leagues were suddenly stopped, and each country was hesitating between a relaunch of the competition and a premature ending. Some opted for the latter option and took as the final standing of the season the ranking from the moment the competition got interrupted. This decision has been perceived as unfair, especially by those teams who had remaining matches against easier opponents. In this paper, we introduce a tool to calculate in a fairer way the final standings of domestic leagues that have to stop prematurely: our Probabilistic Final Standing Calculator (PFSC). It is based on a stochastic model taking into account the results of the matches played and simulating the remaining matches, yielding the probabilities for the various possible final rankings. We have compared our PFSC with state-of-the-art prediction models, using previous seasons which we pretend to stop at different points in time. We illustrate our PFSC by showing how a probabilistic ranking of the French Ligue 1 in the stopped 2019–2020 season could have led to alternative, potentially fairer, decisions on the final standing.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2021-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10182-021-00416-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9467960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}