{"title":"On the Metricity of the Chatterjee Correlation Coefficient","authors":"Flavio Chierichetti, Mirko Giacchini, Ravi Kumar","doi":"10.1080/00031305.2025.2571183","DOIUrl":"https://doi.org/10.1080/00031305.2025.2571183","url":null,"abstract":"We show that the distance measure implied by the recently proposed Chatterjee coefficient of correlation can violate the triangle inequality, both in theory and in practice.","PeriodicalId":50801,"journal":{"name":"American Statistician","volume":"10 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145255058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bad estimation, good prediction: the Lasso in dense regimes","authors":"Andrea Bratsberg, Magne Thoresen, Jelle J. Goeman","doi":"10.1080/00031305.2025.2569464","DOIUrl":"https://doi.org/10.1080/00031305.2025.2569464","url":null,"abstract":"For high-dimensional omics data, sparsity-inducing regularization methods such as the Lasso are widely used and often yield strong predictive performance, even in settings when the assumption of sparsity is likely violated. We demonstrate that under a specific dense model, namely the high-dimensional joint latent variable model, the Lasso produces sparse prediction rules with favorable prediction error bounds, even when the underlying regression coefficient vector is not sparse at all. We further argue that this model better represents many types of omics data than sparse linear regression models. We prove that the prediction bound under this model in fact decreases with increasing number of predictors, and confirm this through simulation examples. These results highlight the need for caution when interpreting sparse prediction rules, as strong prediction accuracy of a sparse prediction rule may not imply underlying biological significance of the individual predictors.","PeriodicalId":50801,"journal":{"name":"American Statistician","volume":"22 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145241309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear Model Estimation and Prediction for p>n","authors":"Ronald Christensen","doi":"10.1080/00031305.2025.2566251","DOIUrl":"https://doi.org/10.1080/00031305.2025.2566251","url":null,"abstract":"","PeriodicalId":50801,"journal":{"name":"American Statistician","volume":"131 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145153780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing Kendall’s τ and Hidden Structures in Ranked Data","authors":"Nicholas D. Edwards, Enzo de Jong, Feng Liu, Stephen T. Ferguson","doi":"10.1080/00031305.2025.2564268","DOIUrl":"https://doi.org/10.1080/00031305.2025.2564268","url":null,"abstract":"Ranked data is commonly used in research across many fields of study including medicine, biology, psychology, and economics. One common statistic used for analyzing ranked data is Kendall’s τ coefficient, a non-parametric measure of rank correlation which describes the strength of the association between two monotonic continuous or ordinal variables. While the mathematics involved in calculating Kendall's τ is well-established, there are relatively few graphing methods available to visualize the results. Here, we describe several alternative and complementary visualization methods and provide an interactive app for graphing Kendall's τ. The resulting graphs provide a visualization of rank correlation which helps display the proportion of concordant and discordant pairs. Moreover, these methods highlight other key features of the data which are not represented by Kendall's τ alone but may nevertheless be meaningful, such as longer monotonic chains and the relationship between discrete pairs of observations. We demonstrate the utility of these approaches through several examples and compare our results to other visualization methods.","PeriodicalId":50801,"journal":{"name":"American Statistician","volume":"24 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145116181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"L1 Prominence Measures for Directed Graphs","authors":"Seungwoo Kang, Hee-Seok Oh","doi":"10.1080/00031305.2025.2563730","DOIUrl":"https://doi.org/10.1080/00031305.2025.2563730","url":null,"abstract":"We introduce novel measures, $L_1$ prestige and $L_1$ centrality, for quantifying the prominence of each vertex in a strongly connected and directed graph by utilizing the concept of $L_1$ data depth (Vardi and Zhang, Proc. Natl. Acad. Sci. U.S.A. 97(4):1423–1426, 2000). The former measure quantifies the degree of prominence of each vertex in receiving choices, whereas the latter measure evaluates the degree of importance in giving choices. The proposed measures can handle graphs with both edge and vertex weights, as well as undirected graphs. However, examining a graph using a measure defined over a single ‘scale’ inevitably leads to a loss of information, as each vertex may exhibit distinct structural characteristics at different levels of locality. To this end, we further develop local versions of the proposed measures with a tunable locality parameter. Using these tools, we present a multiscale network analysis framework that provides much richer structural information about each vertex than a single-scale inspection. By applying the proposed measures to the networks constructed from the Seoul Mobility Flow Data, it is demonstrated that these measures accurately depict and uncover the inherent characteristics of individual city regions.","PeriodicalId":50801,"journal":{"name":"American Statistician","volume":"190 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145133501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}