{"title":"A Bayesian framework to evaluate evidence in cases of alleged cheating with secret codes in sports","authors":"Aafko Boonstra, Ronald Meester","doi":"arxiv-2409.08172","DOIUrl":"https://doi.org/arxiv-2409.08172","url":null,"abstract":"We present a Bayesian framework to analyze a case of alleged cheating in the mind sport contract bridge. We explain why a Bayesian approach is called for, and not a frequentistic one. We argue that such a Bayesian framework can and should also be used in other sports for cases of alleged cheating by means of illegal signalling.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Cost-Aware Approach to Adversarial Robustness in Neural Networks","authors":"Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth","doi":"arxiv-2409.07609","DOIUrl":"https://doi.org/arxiv-2409.07609","url":null,"abstract":"Considering the growing prominence of production-level AI and the threat of adversarial attacks that can evade a model at run-time, evaluating the robustness of models to these evasion attacks is of critical importance. Additionally, testing model changes likely means deploying the models to, e.g., a car, a medical imaging device, or a drone to see how they affect performance, making un-tested changes a public problem that reduces development speed, increases cost of development, and makes it difficult (if not impossible) to parse cause from effect. In this work, we used survival analysis as a cloud-native, time-efficient and precise method for predicting model performance in the presence of adversarial noise. For neural networks in particular, the relationships between the learning rate, batch size, training time, convergence time, and deployment cost are highly complex, so researchers generally rely on benchmark datasets to assess the ability of a model to generalize beyond the training data. To address this, we propose using accelerated failure time models to measure the effect of hardware choice, batch size, number of epochs, and test-set accuracy by using adversarial attacks to induce failures on a reference model architecture before deploying the model to the real world. We evaluate several GPU types and use the Tree Parzen Estimator to maximize model robustness and minimize model run-time simultaneously. This provides a way to evaluate the model and optimise it in a single step, while simultaneously allowing us to model the effect of model parameters on training time, prediction time, and accuracy. Using this technique, we demonstrate that newer, more-powerful hardware does decrease the training time, but with a monetary and power cost that far outpaces the marginal gains in accuracy.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised anomaly detection in spatio-temporal stream network sensor data","authors":"Edgar Santos-Fernandez, Jay M. Ver Hoef, Erin E. Peterson, James McGree, Cesar A. Villa, Catherine Leigh, Ryan Turner, Cameron Roberts, Kerrie Mengersen","doi":"arxiv-2409.07667","DOIUrl":"https://doi.org/arxiv-2409.07667","url":null,"abstract":"The use of in-situ digital sensors for water quality monitoring is becoming increasingly common worldwide. While these sensors provide near real-time data for science, the data are prone to technical anomalies that can undermine the trustworthiness of the data and the accuracy of statistical inferences, particularly in spatial and temporal analyses. Here we propose a framework for detecting anomalies in sensor data recorded in stream networks, which takes advantage of spatial and temporal autocorrelation to improve detection rates. The proposed framework involves the implementation of effective data imputation to handle missing data, alignment of time-series to address temporal disparities, and the identification of water quality events. We explore the effectiveness of a suite of state-of-the-art statistical methods including posterior predictive distributions, finite mixtures, and Hidden Markov Models (HMM). We showcase the practical implementation of automated anomaly detection in near-real time by employing a Bayesian recursive approach. This demonstration is conducted through a comprehensive simulation study and a practical application to a substantive case study situated in the Herbert River, located in Queensland, Australia, which flows into the Great Barrier Reef. We found that methods such as posterior predictive distributions and HMM produce the best performance in detecting multiple types of anomalies. Utilizing data from multiple sensors deployed relatively near one another enhances the ability to distinguish between water quality events and technical anomalies, thereby significantly improving the accuracy of anomaly detection. Thus, uncertainty and biases in water quality reporting, interpretation, and modelling are reduced, and the effectiveness of subsequent management actions improved.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142187904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stratospheric aerosol source inversion: Noise, variability, and uncertainty quantification","authors":"J. Hart, I. Manickam, M. Gulian, L. Swiler, D. Bull, T. Ehrmann, H. Brown, B. Wagman, J. Watkins","doi":"arxiv-2409.06846","DOIUrl":"https://doi.org/arxiv-2409.06846","url":null,"abstract":"Stratospheric aerosols play an important role in the earth system and can affect the climate on timescales of months to years. However, estimating the characteristics of partially observed aerosol injections, such as those from volcanic eruptions, is fraught with uncertainties. This article presents a framework for stratospheric aerosol source inversion which accounts for background aerosol noise and earth system internal variability via a Bayesian approximation error approach. We leverage specially designed earth system model simulations using the Energy Exascale Earth System Model (E3SM). A comprehensive framework for data generation, data processing, dimension reduction, operator learning, and Bayesian inversion is presented where each component of the framework is designed to address particular challenges in stratospheric modeling on the global scale. We present numerical results using synthesized observational data to rigorously assess the ability of our approach to estimate aerosol sources and associate uncertainty with those estimates.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mechanistic-statistical model for the expansion of ash dieback","authors":"Coralie Fritsch (IECL, SIMBA), Marie Grosdidier (BioSP), Anne Gégout-Petit (IECL, SIMBA), Benoit Marçais (IAM)","doi":"arxiv-2409.06273","DOIUrl":"https://doi.org/arxiv-2409.06273","url":null,"abstract":"Hymenoscyphus fraxineus is an invasive forest fungal pathogen that induces severe dieback in European ash populations. The spread of the disease has been closely monitored in France by the forest health survey system. We have developed a mechanistic-statistical model that describes the spread of the disease. It takes into account climate (summer temperature and spring rainfall), pathogen population dynamics (foliar infection, Allee effect induced by limited sexual partner encounters) and host density. We fitted this model using available disease reports. We estimated the parameters of our model, first identifying the appropriate ranges for the parameters, which led to a model reduction, and then using an adaptive multiple importance sampling algorithm for fitting. The model reproduces well the propagation observed in France over the last 20 years. In particular, it predicts the absence of disease impact in the south-east of the country and its weak development in the Garonne valley in south-west France. Summer temperature is the factor with the highest overall effect on disease spread, and explains the limited impact in southern France. Among the different temperature indices tested, the number of summer days with temperatures above 28°C gave the best qualitative behavior and the best fit. In contrast, the Allee effect and the heterogeneity of spring precipitation did not strongly affect the overall expansion of H. fraxineus in France and could be neglected in the modeling process. The model can be used to infer the average annual dispersal of H. fraxineus in France.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Teacher-student relationship and teaching styles in primary education. A model of analysis","authors":"Maria-Eugenia Cardenal, Octavio-David Diaz-Santana, Sara-Maria Gonzalez-Betancor","doi":"arxiv-2409.06562","DOIUrl":"https://doi.org/arxiv-2409.06562","url":null,"abstract":"Purpose: The teacher role in the classroom can explain important aspects of the student's school experience. The teacher-student relationship, a central dimension of social capital, influences students' engagement, and the teaching style plays an important role in student outcomes. But there is scarce literature that links teaching styles to teacher-student relationship. This article aims to: 1) analyze whether there is a relationship between teaching styles and the type of relationship perceived by students; 2) test whether this relationship is equally strong for any teaching style; and 3) determine the extent to which students' perceptions vary according to their profile. Design/methodology/approach: A structural equation model with four latent variables is estimated: two for the teacher-student relationship (emotional vs. educational) and two for the teaching styles (directive vs. participative), with information for 21126 sixth-grade primary students in 2019 in Spain. Findings: Teacher-student relationships and teaching styles are interconnected. The participative style implies a better relationship. The perceptions of the teacher are heterogeneous, depending on gender (girls perceive more clearly than boys) and on educational background (children from lower educational background perceive both types of teaching styles more clearly). Originality/value: The analysis is based on the point of view of the addressee of the teacher's work, i.e. the student. It provides a model that can be replicated in any other education system. The latent variables, based on a periodically administered questionnaire, could be estimated with data from diagnostic assessments in other countries, which in turn would allow the formulation of context-specific educational policy proposals that take into account student feedback.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intrinsic geometry-inspired dependent toroidal distribution: Application to regression model for astigmatism data","authors":"Buddhananda Banerjee, Surojit Biswas","doi":"arxiv-2409.06229","DOIUrl":"https://doi.org/arxiv-2409.06229","url":null,"abstract":"This paper introduces a dependent toroidal distribution to analyze astigmatism data following cataract surgery. Rather than utilizing the flat torus, we opt to represent the bivariate angular data on the surface of a curved torus, which naturally offers smooth edge identifiability and accommodates a variety of curvatures: positive, negative, and zero. Beginning with the area-uniform toroidal distribution on this curved surface, we develop a five-parameter dependent toroidal distribution that harnesses its intrinsic geometry via the area element to model the distribution of two dependent circular random variables. We show that both marginal distributions are Cardioid, with one of the conditional variables also following a Cardioid distribution. This key feature enables us to propose a circular-circular regression model based on conditional expectations derived from circular moments. To address the high rejection rate (approximately 50%) in existing acceptance-rejection sampling methods for Cardioid distributions, we introduce an exact sampling method based on a probabilistic transformation. Additionally, we generate random samples from the proposed dependent toroidal distribution through suitable conditioning. This bivariate distribution and the regression model are applied to analyze astigmatism data arising in the follow-up of one and three months due to cataract surgery.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monitoring road infrastructures from satellite images in Greater Maputo: an object-oriented classification approach","authors":"Arianna Burzacchi, Matteo Landrò, Simone Vantini","doi":"arxiv-2409.06406","DOIUrl":"https://doi.org/arxiv-2409.06406","url":null,"abstract":"The information about pavement surface type is rarely available in road network databases of developing countries although it represents a cornerstone of the design of efficient mobility systems. This research develops an automatic classification pipeline for road pavement which makes use of satellite images to recognize road segments as paved or unpaved. The proposed methodology is based on an object-oriented approach, so that each road is classified by looking at the distribution of its pixels in the RGB space. The proposed approach is proven to be accurate, inexpensive, and readily replicable in other cities.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some statistical aspects of the Covid-19 response","authors":"Simon N. Wood, Ernst C. Wit, Paul M. McKeigue, Danshu Hu, Beth Flood, Lauren Corcoran, Thea Abou Jawad","doi":"arxiv-2409.06473","DOIUrl":"https://doi.org/arxiv-2409.06473","url":null,"abstract":"This paper discusses some statistical aspects of the U.K. Covid-19 pandemic response, focussing particularly on cases where we believe that a statistically questionable approach or presentation has had a substantial impact on public perception, or government policy, or both. We discuss the presentation of statistics relating to Covid risk, and the risk of the response measures, arguing that biases tended to operate in opposite directions, overplaying Covid risk and underplaying the response risks. We also discuss some issues around presentation of life loss data, excess deaths and the use of case data. The consequences of neglect of most individual variability from epidemic models, alongside the consequences of some other statistically important omissions are also covered. Finally the evidence for full stay at home lockdowns having been necessary to reverse waves of infection is examined, with new analyses provided for a number of European countries.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovery of Two Ultra-Diffuse Galaxies with Unusually Bright Globular Cluster Luminosity Functions via a Mark-Dependently Thinned Point Process (MATHPOP)","authors":"Dayi Li, Gwendolyn Eadie, Patrick Brown, William Harris, Roberto Abraham, Pieter van Dokkum, Steven Janssens, Samantha Berek, Shany Danieli, Aaron Romanowsky, Joshua Speagle","doi":"arxiv-2409.06040","DOIUrl":"https://doi.org/arxiv-2409.06040","url":null,"abstract":"We present \\textsc{Mathpop}, a novel method to infer the globular cluster (GC) counts in ultra-diffuse galaxies (UDGs) and low-surface brightness galaxies (LSBGs). Many known UDGs have a surprisingly high ratio of GC number to surface brightness. However, standard methods to infer GC counts in UDGs face various challenges, such as photometric measurement uncertainties, GC membership uncertainties, and assumptions about the GC luminosity functions (GCLFs). \\textsc{Mathpop} tackles these challenges using the mark-dependent thinned point process, enabling joint inference of the spatial and magnitude distributions of GCs. In doing so, \\textsc{Mathpop} allows us to infer and quantify the uncertainties in both GC counts and GCLFs with minimal assumptions. As a precursor to \\textsc{Mathpop}, we also address the data uncertainties coming from the selection process of GC candidates: we obtain probabilistic GC candidates instead of the traditional binary classification based on the color--magnitude diagram. We apply \\textsc{Mathpop} to 40 LSBGs in the Perseus cluster using GC catalogs from a \\textit{Hubble Space Telescope} imaging program. We then compare our results to those from an independent study using the standard method. We further calibrate and validate our approach through extensive simulations. Our approach reveals two LSBGs having GCLF turnover points much brighter than the canonical value with Bayes' factor being $\\sim 4.5$ and $\\sim 2.5$, respectively. An additional crude maximum-likelihood estimation shows that their GCLF TO points are approximately $0.9$~mag and $1.1$~mag brighter than the canonical value, with $p$-value $\\sim 10^{-8}$ and $\\sim 10^{-5}$, respectively.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}