Yahui Bai, Yuhe Gao, Runzhe Wan, Sheng Zhang, Rui Song
{"title":"A Review of Reinforcement Learning in Financial Applications","authors":"Yahui Bai, Yuhe Gao, Runzhe Wan, Sheng Zhang, Rui Song","doi":"10.1146/annurev-statistics-112723-034423","DOIUrl":"https://doi.org/10.1146/annurev-statistics-112723-034423","url":null,"abstract":"In recent years, there has been a growing trend of applying reinforcement learning (RL) in financial applications. This approach has shown great potential for decision-making tasks in finance. In this review, we present a comprehensive study of the applications of RL in finance and conduct a series of meta-analyses to investigate the common themes in the literature, such as the factors that most significantly affect RL's performance compared with traditional methods. Moreover, we identify challenges, including explainability, Markov decision process modeling, and robustness, that hinder the broader utilization of RL in the financial industry and discuss recent advancements in overcoming these challenges. Finally, we propose future research directions, such as benchmarking, contextual RL, multi-agent RL, and model-based RL to address these challenges and to further enhance the implementation of RL in finance.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"25 1","pages":""},"PeriodicalIF":7.9,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142642981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Modeling of Longitudinal and Survival Data","authors":"Jane-Ling Wang, Qixian Zhong","doi":"10.1146/annurev-statistics-112723-034334","DOIUrl":"https://doi.org/10.1146/annurev-statistics-112723-034334","url":null,"abstract":"In medical studies, time-to-event outcomes such as time to death or relapse of a disease are routinely recorded along with longitudinal data that are observed intermittently during the follow-up period. For various reasons, marginal approaches to model the event time, corresponding to separate approaches for survival data/longitudinal data, tend to induce bias and lose efficiency. Instead, a joint modeling approach that brings the two types of data together can reduce or eliminate the bias and yield a more efficient estimation procedure. A well-established avenue for joint modeling is the joint likelihood approach that often produces semiparametric efficient estimators for the finite-dimensional parameter vectors in both models. Through a transformation survival model with an unspecified baseline hazard function, this review introduces joint modeling that accommodates both baseline covariates and time-varying covariates. The focus is on the major challenges faced by joint modeling and how they can be overcome. A review of available software implementations and a brief discussion of future directions of the field are also included.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"246 1","pages":""},"PeriodicalIF":7.9,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geometric Methods for Cosmological Data on the Sphere","authors":"Javier Carrón Duque, Domenico Marinucci","doi":"10.1146/annurev-statistics-040522-093748","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040522-093748","url":null,"abstract":"This review is devoted to recent developments in the statistical analysis of spherical data, strongly motivated by applications in cosmology. We start from a brief discussion of cosmological questions and motivations, arguing that most cosmological observables are spherical random fields. Then, we introduce some mathematical background on spherical random fields, including spectral representations and the construction of needlet and wavelet frames. We then focus on some specific issues, including tools and algorithms for map reconstruction (i.e., separating the different physical components that contribute to the observed field), geometric tools for testing the assumptions of Gaussianity and isotropy, and multiple testing methods to detect contamination in the field due to point sources. Although these tools are introduced in the cosmological context, they can be applied to other situations dealing with spherical data. Finally, we discuss more recent and challenging issues, such as the analysis of polarization data, which can be viewed as realizations of random fields taking values in spin fiber bundles. Expected final online publication date for the Annual Review of Statistics and Its Application, Volume 11 is March 2024. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"11 3","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71473809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stochastic Models of Rainfall","authors":"Paul J. Northrop","doi":"10.1146/annurev-statistics-040622-023838","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040622-023838","url":null,"abstract":"Rainfall is the main input to most hydrological systems. To assess flood risk for a catchment area, hydrologists use models that require long series of subdaily, perhaps even subhourly, rainfall data, ideally from locations that cover the area. If historical data are not sufficient for this purpose, an alternative is to simulate synthetic data from a suitably calibrated model. We review stochastic models that have a mechanistic structure, intended to mimic physical features of the rainfall processes, and are constructed using stationary point processes. We describe models for temporal and spatial-temporal rainfall and consider how they can be fitted to data. We provide an example application using a temporal model and an illustration of data simulated from a spatial-temporal model. We discuss how these models can contribute to the simulation of future rainfall that reflects our changing climate.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"88 23","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71435578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Update on Measurement Error Modeling","authors":"Mushan Li, Yanyuan Ma","doi":"10.1146/annurev-statistics-040722-043616","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040722-043616","url":null,"abstract":"The issues caused by measurement errors have been recognized for almost 90 years, and research in this area has flourished since the 1980s. We review some of the classical methods in both density estimation and regression problems with measurement errors. In both problems, we consider when the original error-free model is parametric, nonparametric, and semiparametric, in combination with different error types. We also summarize and explain some new approaches, including recent developments and challenges in the high-dimensional setting.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"20 14","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50164713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Microbiome Data","authors":"Christine B. Peterson, Satabdi Saha, Kim-Anh Do","doi":"10.1146/annurev-statistics-040522-120734","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040522-120734","url":null,"abstract":"The microbiome represents a hidden world of tiny organisms populating not only our surroundings but also our own bodies. By enabling comprehensive profiling of these invisible creatures, modern genomic sequencing tools have given us an unprecedented ability to characterize these populations and uncover their outsize impact on our environment and health. Statistical analysis of microbiome data is critical to infer patterns from the observed abundances. The application and development of analytical methods in this area require careful consideration of the unique aspects of microbiome profiles. We begin this review with a brief overview of microbiome data collection and processing and describe the resulting data structure. We then provide an overview of statistical methods for key tasks in microbiome data analysis, including data visualization, comparison of microbial abundance across groups, regression modeling, and network inference. We conclude with a discussion and highlight interesting future directions.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"20 16","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50164711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributional Regression for Data Analysis","authors":"Nadja Klein","doi":"10.1146/annurev-statistics-040722-053607","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040722-053607","url":null,"abstract":"Flexible modeling of how an entire distribution changes with covariates is an important yet challenging generalization of mean-based regression that has seen growing interest over the past decades in both the statistics and machine learning literature. This review outlines selected state-of-the-art statistical approaches to distributional regression, complemented with alternatives from machine learning. Topics covered include the similarities and differences between these approaches, extensions, properties and limitations, estimation procedures, and the availability of software. In view of the increasing complexity and availability of large-scale data, this review also discusses the scalability of traditional estimation methods, current trends, and open challenges. Illustrations are provided using data on childhood malnutrition in Nigeria and Australian electricity prices.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"88 22","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71435579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Role of Statistics in Detecting Misinformation: A Review of the State of the Art, Open Issues, and Future Research Directions","authors":"Zois Boukouvalas, Allison Shafer","doi":"10.1146/annurev-statistics-040622-033806","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040622-033806","url":null,"abstract":"With the evolution of social media, cyberspace has become the default medium for social media users to communicate, especially during high-impact events such as pandemics, natural disasters, terrorist attacks, and periods of political unrest. However, during such events, misinformation can spread rapidly on social media, affecting decision-making and creating social unrest. Identifying and curtailing the spread of misinformation during high-impact events are significant data challenges given the scarcity and variety of the data, the speed by which misinformation can propagate, and the fairness aspects associated with this societal problem. Recent statistical machine learning advances have shown promise for misinformation detection; however, key limitations still make this a significant challenge. These limitations relate to using representative and bias-free multimodal data and to the explainability, fairness, and reliable performance of a system that detects misinformation. In this article, we critically discuss the current state-of-the-art approaches that attempt to respond to these complex requirements and present major unsolved issues; future research directions; and the synergies among statistics, data science, and other sciences for detecting misinformation.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"20 15","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50164712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shape-Constrained Statistical Inference","authors":"Lutz Dümbgen","doi":"10.1146/annurev-statistics-033021-014937","DOIUrl":"https://doi.org/10.1146/annurev-statistics-033021-014937","url":null,"abstract":"Statistical models defined by shape constraints are a valuable alternative to parametric models or nonparametric models defined in terms of quantitative smoothness constraints. While the latter two classes of models are typically difficult to justify a priori, many applications involve natural shape constraints, for instance, monotonicity of a density or regression function. We review some of the history of this subject and recent developments, with special emphasis on algorithmic aspects, adaptivity, honest confidence bands for shape-constrained curves, and distributional regression, i.e., inference about the conditional distribution of a real-valued response given certain covariates.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"20 18","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50164709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variable Importance Without Impossible Data","authors":"Masayoshi Mase, Art B. Owen, Benjamin B. Seiler","doi":"10.1146/annurev-statistics-040722-045325","DOIUrl":"https://doi.org/10.1146/annurev-statistics-040722-045325","url":null,"abstract":"The most popular methods for measuring importance of the variables in a black-box prediction algorithm make use of synthetic inputs that combine predictor variables from multiple observations. These inputs can be unlikely, physically impossible, or even logically impossible. As a result, the predictions for such cases can be based on data very unlike any the black box was trained on. We think that users cannot trust an explanation of the decision of a prediction algorithm when the explanation uses such values. Instead, we advocate a method called cohort Shapley, which is grounded in economic game theory and uses only actually observed data to quantify variable importance. Cohort Shapley works by narrowing the cohort of observations judged to be similar to a target observation on one or more features. We illustrate it on an algorithmic fairness problem where it is essential to attribute importance to protected variables that the model was not trained on.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"16 12","pages":""},"PeriodicalIF":7.9,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50165107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}