{"title":"Discovery of Two Ultra-Diffuse Galaxies with Unusually Bright Globular Cluster Luminosity Functions via a Mark-Dependently Thinned Point Process (MATHPOP)","authors":"Dayi Li, Gwendolyn Eadie, Patrick Brown, William Harris, Roberto Abraham, Pieter van Dokkum, Steven Janssens, Samantha Berek, Shany Danieli, Aaron Romanowsky, Joshua Speagle","doi":"arxiv-2409.06040","DOIUrl":"https://doi.org/arxiv-2409.06040","url":null,"abstract":"We present Mathpop, a novel method to infer globular cluster (GC) counts in ultra-diffuse galaxies (UDGs) and low-surface-brightness galaxies (LSBGs). Many known UDGs have a surprisingly high ratio of GC number to surface brightness. However, standard methods to infer GC counts in UDGs face various challenges, such as photometric measurement uncertainties, GC membership uncertainties, and assumptions about the GC luminosity function (GCLF). Mathpop tackles these challenges using a mark-dependently thinned point process, enabling joint inference of the spatial and magnitude distributions of GCs. In doing so, Mathpop allows us to infer and quantify the uncertainties in both GC counts and GCLFs with minimal assumptions. As a precursor to Mathpop, we also address the data uncertainties arising from the selection of GC candidates: we obtain probabilistic GC candidates instead of the traditional binary classification based on the color-magnitude diagram. We apply Mathpop to 40 LSBGs in the Perseus cluster using GC catalogs from a Hubble Space Telescope imaging program. We then compare our results to those from an independent study using the standard method. We further calibrate and validate our approach through extensive simulations. Our approach reveals two LSBGs with GCLF turnover points much brighter than the canonical value, with Bayes factors of approximately 4.5 and 2.5, respectively. An additional crude maximum-likelihood estimation shows that their GCLF turnover points are approximately 0.9 mag and 1.1 mag brighter than the canonical value, with p-values of approximately 10^-8 and 10^-5, respectively.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
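The core idea of the Mathpop abstract above, a point process whose points are kept or dropped with a probability depending on their mark (here, GC magnitude), can be illustrated with a minimal simulation. This is a sketch under assumed parameters (a Gaussian GCLF and a hypothetical logistic completeness curve), not the paper's actual model or numbers.

```python
import math
import random

random.seed(42)

def completeness(mag, m50=26.0, slope=2.0):
    # Hypothetical logistic detection probability: bright sources
    # (small magnitude) are almost always retained, faint ones are
    # thinned away. m50 is the magnitude of 50% completeness.
    return 1.0 / (1.0 + math.exp(slope * (mag - m50)))

def simulate_thinned_gcs(n_true=500, turnover=23.5, sigma=1.0):
    # Marks: GC magnitudes drawn from a Gaussian GCLF with the given
    # turnover point.
    mags = [random.gauss(turnover, sigma) for _ in range(n_true)]
    # Mark-dependent thinning: each point is retained independently
    # with a probability that depends on its own mark.
    observed = [m for m in mags if random.random() < completeness(m)]
    return mags, observed

true_mags, obs_mags = simulate_thinned_gcs()
print(len(true_mags), len(obs_mags))
```

Inference then runs in the opposite direction: given only the observed (thinned) magnitudes and a completeness model, recover the latent count and the GCLF turnover.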
{"title":"Enhancing Empathic Accuracy: Penalized Functional Alignment Method to Correct Misalignment in Emotional Perception","authors":"Linh H Nghiem, Jing Cao, Chul Moon","doi":"arxiv-2409.05343","DOIUrl":"https://doi.org/arxiv-2409.05343","url":null,"abstract":"Empathic accuracy (EA) is the ability of one person to accurately understand the thoughts and feelings of another person, which is crucial for social and psychological interactions. Traditionally, EA is measured by comparing perceivers' real-time ratings of emotional states with the target's self-evaluation. However, these analyses often ignore or simplify misalignments between ratings (such as assuming a fixed delay), leading to biased EA measures. We introduce a novel alignment method that accommodates diverse misalignment patterns, using the square-root velocity representation to decompose ratings into amplitude and phase components. Additionally, we incorporate a regularization term that prevents excessive alignment by constraining temporal shifts to plausible human perception bounds. The overall alignment method is implemented efficiently through a constrained dynamic programming algorithm. We demonstrate the superior performance of our method through simulations and real-world applications to video and music datasets.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
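The square-root velocity representation (SRVF) mentioned in the abstract above maps a curve f to q(t) = f'(t)/sqrt(|f'(t)|), which is the standard device for separating amplitude from phase in functional alignment. A minimal finite-difference sketch on a toy rating curve (the curve and step size are illustrative, not from the paper):

```python
import math

def srvf(f, dt=1.0):
    # Square-root velocity representation of a sampled curve f:
    # q(t) = f'(t) / sqrt(|f'(t)|), with f' from forward differences.
    q = []
    for i in range(len(f) - 1):
        v = (f[i + 1] - f[i]) / dt
        q.append(0.0 if v == 0 else v / math.sqrt(abs(v)))
    return q

# Toy emotion-rating curve: in the SRVF domain, time-warping the
# curve acts by a simple group action, which is what makes the
# amplitude/phase decomposition and alignment tractable.
ratings = [0.0, 0.2, 0.5, 0.9, 1.0, 0.8, 0.4]
q = srvf(ratings)
print(q)
```

The paper's penalized alignment additionally constrains how far the warping may shift time, which this sketch does not implement.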
{"title":"Learning about Spatial and Temporal Proximity using Tree-Based Methods","authors":"Ines Levin","doi":"arxiv-2409.06046","DOIUrl":"https://doi.org/arxiv-2409.06046","url":null,"abstract":"Learning about the relationship between distance to landmarks and events and phenomena of interest is a multi-faceted problem, as it may require taking into account multiple dimensions, including the spatial position of landmarks, the timing of events, and the attributes of occurrences and locations. Here I show that tree-based methods are well suited to the study of these questions, as they allow exploring the relationship between proximity metrics and outcomes of interest in a non-parametric and data-driven manner. I illustrate the usefulness of tree-based methods vis-à-vis conventional regression methods by examining the association between (i) distance to border crossings along the US-Mexico border and support for immigration reform, and (ii) distance to mass shootings and support for gun control.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
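The appeal of tree-based methods in the abstract above is that a tree learns distance thresholds from the data rather than assuming a linear distance effect. A single tree node already shows the mechanism; the toy data below (distances and support shares) are invented for illustration and are not from the paper:

```python
def best_split(distances, outcomes):
    # One node of a regression tree: scan candidate thresholds on the
    # distance covariate and pick the split that minimizes the summed
    # within-group variance (SSE) of the outcome.
    def sse(ys):
        if not ys:
            return 0.0
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys)

    best = (None, float("inf"))
    for t in sorted(set(distances)):
        left = [y for d, y in zip(distances, outcomes) if d <= t]
        right = [y for d, y in zip(distances, outcomes) if d > t]
        score = sse(left) + sse(right)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical pattern: support drops sharply beyond ~50 km.
dist = [5, 10, 20, 40, 60, 80, 120, 200]
supp = [0.8, 0.75, 0.78, 0.7, 0.4, 0.35, 0.3, 0.28]
threshold, score = best_split(dist, supp)
print(threshold, round(score, 4))
```

A full tree recurses on each side; the data-driven threshold (here 40) is exactly the kind of non-linearity a single linear regression coefficient would smooth over.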
{"title":"UAVDB: Trajectory-Guided Adaptable Bounding Boxes for UAV Detection","authors":"Yu-Hsi Chen","doi":"arxiv-2409.06490","DOIUrl":"https://doi.org/arxiv-2409.06490","url":null,"abstract":"With the rapid development of drone technology, accurate detection of Unmanned Aerial Vehicles (UAVs) has become essential for applications such as surveillance, security, and airspace management. In this paper, we propose a novel trajectory-guided method, the Patch Intensity Convergence (PIC) technique, which generates high-fidelity bounding boxes for UAV detection tasks without the need for manual labeling. The PIC technique forms the foundation for developing UAVDB, a database explicitly created for UAV detection. Unlike existing datasets, which often use low-resolution footage or focus on UAVs in simple backgrounds, UAVDB employs high-resolution video to capture UAVs at various scales, ranging from hundreds of pixels to nearly single-digit sizes. This broad-scale variation enables comprehensive evaluation of detection algorithms across different UAV sizes and distances. Applying the PIC technique, we can also efficiently generate detection datasets from trajectory or positional data, even without size information. We extensively benchmark UAVDB using YOLOv8-series detectors, offering a detailed performance analysis. Our findings highlight UAVDB's potential as a vital database for advancing UAV detection, particularly in high-resolution and long-distance tracking scenarios.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing and Forecasting the Success in the Men's Ice Hockey World (Junior) Championships Using a Dynamic Ranking Model","authors":"Vladimír Holý","doi":"arxiv-2409.05714","DOIUrl":"https://doi.org/arxiv-2409.05714","url":null,"abstract":"What factors contribute to the success of national teams in the Men's Ice Hockey World Championships and the Men's Ice Hockey World Junior Championships? This study examines whether hosting the tournament provides a home advantage; the influence of past tournament performances; the impact of players' physical characteristics such as height, weight, and age; and the value of experience from the World Championships compared to the NHL and other leagues. We employ a dynamic ranking model based on the Plackett-Luce distribution with time-varying strength parameters driven by the score. Additionally, we conduct a forecasting analysis to predict the probabilities of winning the tournament, earning a medal, and advancing to the playoff phase.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
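The Plackett-Luce model named in the abstract above assigns each team a positive strength and builds a ranking probability by sequentially picking the first-place finisher from the remaining teams in proportion to strength. A minimal sketch with made-up strengths (the paper's strengths are time-varying and score-driven, which this omits):

```python
def plackett_luce_prob(strengths, ranking):
    # Probability of observing `ranking` (team indices, best to worst)
    # under the Plackett-Luce model: at each stage the next finisher
    # is chosen with probability proportional to its strength among
    # the teams still unplaced.
    prob = 1.0
    remaining = list(ranking)
    for winner in ranking:
        total = sum(strengths[i] for i in remaining)
        prob *= strengths[winner] / total
        remaining.remove(winner)
    return prob

# Three teams with illustrative strengths 3, 2, 1.
p = plackett_luce_prob([3.0, 2.0, 1.0], [0, 1, 2])
print(p)  # 3/6 * 2/3 * 1/1 = 1/3
```

Summing this probability over all orderings of the teams gives 1, which is a quick sanity check on any implementation.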
{"title":"Modeling the Spatial Distributions of Macro Base Stations with Homogeneous Density: Theory and Application to Real Networks","authors":"Q. Gontier, C. Tsigros, F. Horlin, J. Wiart, C. Oestges, P. De Doncker","doi":"arxiv-2409.05468","DOIUrl":"https://doi.org/arxiv-2409.05468","url":null,"abstract":"Stochastic geometry is a widely studied field in telecommunications, as in many other scientific disciplines. In the last ten years in particular, theoretical knowledge has advanced considerably, whether for the calculation of metrics characterizing interference, coverage, energy or spectral efficiency, or exposure to electromagnetic fields. Many spatial point process models have been developed but are often set aside, because of their unfamiliarity or limited tractability, in favor of the easier-to-use Poisson point process or regular lattice. This article is intended as a short guide presenting a complete and simple methodology for inferring a real stationary macro antenna network using tractable spatial models. The focus is mainly on repulsive point processes, and in particular on determinantal point processes, which are among the most tractable repulsive point processes. This methodology is applied to Belgian and French cell towers. The results show that, for all stationary distributions in France and Belgium, the best inference model is the beta-Ginibre point process.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
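The beta-Ginibre process selected in the study above can be sketched via its standard finite-matrix construction: the eigenvalues of a complex Ginibre matrix form a (truncated) Ginibre point process, and independently retaining each point with probability beta while rescaling by sqrt(beta) keeps the intensity fixed but weakens the repulsion. This is only a finite-n approximation with arbitrary parameters, not the paper's fitting procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def beta_ginibre(n=64, beta=0.5):
    # Eigenvalues of an n x n complex Ginibre matrix (i.i.d. complex
    # Gaussian entries) approximate the Ginibre point process.
    a = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    eigs = np.linalg.eigvals(a)
    # Independent thinning with probability beta, then rescaling by
    # sqrt(beta) to preserve the mean number of points per unit area:
    # beta -> 1 recovers Ginibre, beta -> 0 approaches Poisson.
    keep = rng.random(n) < beta
    return np.sqrt(beta) * eigs[keep]

points = beta_ginibre()
print(len(points))
```

Fitting beta to tower coordinates then amounts to matching summary statistics (e.g. Ripley's K) or maximizing an approximate likelihood, which is the tractability the abstract refers to.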
{"title":"Kramnik vs Nakamura: A Chess Scandal","authors":"Shiva Maharaj, Nick Polson, Vadim Sokolov","doi":"arxiv-2409.06739","DOIUrl":"https://doi.org/arxiv-2409.06739","url":null,"abstract":"We provide a statistical analysis of the recent controversy between Vladimir Kramnik (former world chess champion) and Hikaru Nakamura. Hikaru Nakamura is a chess prodigy and a five-time United States chess champion. Kramnik called into question Nakamura's 45.5-out-of-46 win streak in an online blitz contest on chess.com. We assess the weight of evidence using an a priori assessment by Viswanathan Anand together with the streak evidence. Based on this evidence, we show that Nakamura has a 99.6 percent chance of not having cheated. We study the statistical fallacies prevalent in both their analyses. On the one hand, Kramnik bases his argument on the claim that the probability of such a streak is very small; this is precisely the Prosecutor's Fallacy. On the other hand, Nakamura tries to refute the accusation by cherry-picking, which violates the likelihood principle. We conclude with a discussion of the relevant statistical literature on fraud detection and the analysis of streaks in sports data.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
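The Prosecutor's Fallacy flagged in the abstract above is the confusion of P(streak | fair play), which can be tiny, with P(fair play | streak), which also depends on the prior and on P(streak | cheating). A one-function Bayes'-rule sketch with purely illustrative numbers (not the paper's inputs or its 99.6 percent figure):

```python
def posterior_not_cheating(prior_cheat, p_streak_cheat, p_streak_fair):
    # Bayes' rule for P(fair | streak): the small number
    # p_streak_fair is only one ingredient; the prior odds of
    # cheating matter just as much.
    num = (1 - prior_cheat) * p_streak_fair
    den = num + prior_cheat * p_streak_cheat
    return num / den

# Illustrative: a 1-in-1000 streak under fair play, a 1-in-10000
# prior of cheating, and a 5% chance a cheater produces the streak.
p = posterior_not_cheating(prior_cheat=1e-4,
                           p_streak_cheat=0.05,
                           p_streak_fair=0.001)
print(round(p, 4))
```

Even with a rare streak, a low prior probability of cheating leaves a high posterior probability of fair play, which is exactly why quoting P(streak | fair) alone is fallacious.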
{"title":"A Comprehensive Framework for Estimating Aircraft Fuel Consumption Based on Flight Trajectories","authors":"Linfeng Zhang, Alex Bian, Changmin Jiang, Lingxiao Wu","doi":"arxiv-2409.05429","DOIUrl":"https://doi.org/arxiv-2409.05429","url":null,"abstract":"Accurate calculation of aircraft fuel consumption plays an irreplaceable role in flight operations, optimization, and pollutant accounting, yet it is difficult because consumption varies with flight conditions and physical factors. Utilizing flight surveillance data, this study develops a comprehensive mathematical framework that links flight dynamics to fuel consumption, providing a set of high-precision, high-resolution fuel calculation methods. The framework also allows other practitioners to select data sources according to their specific needs. The methodology begins by addressing the functional aspects of interval fuel consumption. We apply spectral transformation techniques to mine Automatic Dependent Surveillance-Broadcast (ADS-B) data, identifying key aspects of the flight profile and establishing their theoretical relationships with fuel consumption. Subsequently, a deep neural network with tunable parameters is used to fit this multivariate function, facilitating high-precision calculations of interval fuel consumption. Furthermore, a second-order smooth monotonic interpolation method is constructed, along with a novel estimation method for instantaneous fuel consumption. Numerical results validate the effectiveness of the model. Using ADS-B and Aircraft Communications Addressing and Reporting System (ACARS) data from 2023 for testing, the average error of interval fuel consumption can be reduced to as low as 3.31%, and the error in the integral sense of instantaneous fuel consumption is 8.86%. These results establish this model as the state of the art, achieving the lowest estimation errors in aircraft fuel consumption calculations to date.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142187909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rating Players of Counter-Strike: Global Offensive Based on Plus/Minus value","authors":"Hongyu Xu, Sarat Moka","doi":"arxiv-2409.05052","DOIUrl":"https://doi.org/arxiv-2409.05052","url":null,"abstract":"We propose a player rating mechanism for Counter-Strike: Global Offensive (CS:GO), a popular e-sport, by analyzing players' Plus/Minus values. The Plus/Minus value represents the average point difference between a player's team and the opponent's team across all matches the player has participated in. Using models such as regularized linear regression, logistic regression, and Bayesian linear models, we examine the relationship between player participation and team point differences. The most commonly used metric in the CS community is \"Rating 2.0,\" which focuses solely on individual performance and does not account for indirect contributions to team success. Our approach introduces a new rating system that evaluates both direct and indirect contributions of players, prioritizing those who make a tangible impact on match outcomes rather than those with the highest individual scores. This rating system could help teams distribute rewards more fairly and improve player recruitment. We believe this methodology will positively influence not only the CS community but also the broader e-sports industry.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
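Regressing team point differences on player participation, as in the abstract above, is the adjusted plus/minus idea: encode who was on which side of each match and attribute the margin to individual players via regularized regression. A tiny sketch with a made-up design matrix and margins (the paper's data, encoding, and model choices will differ):

```python
import numpy as np

# Toy adjusted plus/minus: rows are matches, columns are players
# (+1 if on team A, -1 if on team B, 0 if absent); y is the point
# difference (team A minus team B).
X = np.array([
    [ 1,  1, -1, -1],
    [ 1, -1,  1, -1],
    [ 1, -1, -1,  1],
    [-1,  1,  1, -1],
], dtype=float)
y = np.array([4.0, 2.0, 1.0, -3.0])

lam = 0.1  # ridge penalty, illustrative; regularization is needed
           # because teammates' columns are highly collinear
# Closed-form ridge solution: (X^T X + lam I)^{-1} X^T y
beta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print(beta)
```

Each coefficient is a player's estimated contribution to the margin, which captures indirect contributions that a kill-based metric like "Rating 2.0" misses.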
{"title":"Moving from Machine Learning to Statistics: the case of Expected Points in American football","authors":"Ryan S. Brill, Ryan Yee, Sameer K. Deshpande, Abraham J. Wyner","doi":"arxiv-2409.04889","DOIUrl":"https://doi.org/arxiv-2409.04889","url":null,"abstract":"Expected points is a value function fundamental to player evaluation and strategic in-game decision-making across sports analytics, particularly in American football. To estimate expected points, football analysts use machine learning tools, which are not equipped to handle certain challenges. They suffer from selection bias, display counter-intuitive artifacts of overfitting, do not quantify uncertainty in point estimates, and do not account for the strong dependence structure of observational football data. These issues are not unique to American football or even sports analytics; they are general problems analysts encounter across various statistical applications, particularly when using machine learning in lieu of traditional statistical models. We explore these issues in detail and devise expected points models that account for them. We also introduce a widely applicable novel methodological approach to mitigate overfitting, using a catalytic prior to smooth our machine learning models.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
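The catalytic-prior idea named in the abstract above can be shown in miniature: augment the n real observations with M synthetic observations generated from a simpler reference model, down-weight each synthetic row by tau/M, and fit the flexible model on the union, so the synthetic data act as a prior that shrinks the fit toward the simple model. Everything below (data, reference model, the slope-only regression, tau) is an illustrative assumption, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(1)

n, M, tau = 30, 300, 10.0
x = rng.uniform(0, 1, n)
y_real = 2.0 * x + rng.normal(0, 0.3, n)   # real data, true slope ~2

# Synthetic data from a deliberately simple reference model
# (here: predict the constant mean), each weighted tau/M so the
# whole synthetic set counts as tau pseudo-observations.
x_syn = rng.uniform(0, 1, M)
y_syn = np.full(M, y_real.mean())

X = np.concatenate([x, x_syn])
Y = np.concatenate([y_real, y_syn])
w = np.concatenate([np.ones(n), np.full(M, tau / M)])

# Weighted least squares for a slope-only model y = b * x.
b = np.sum(w * X * Y) / np.sum(w * X * X)
print(b)
```

Larger tau pulls the estimate further toward the reference model's behavior; tau -> 0 recovers the unpenalized fit, which is the overfitting-mitigation knob the abstract describes.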