The Journal of Financial Data Science: Latest Articles (Page 10)

Matrix Evolutions: Synthetic Correlations and Explainable Machine Learning for Constructing Robust Investment Portfolios

The Journal of Financial Data Science Pub Date : 2020-07-29 DOI: 10.2139/ssrn.3663220

Jochen Papenbrock, Peter Schwendner, Markus Jaeger, Stephan Krügel

Abstract: In this article, the authors present a novel and highly flexible concept to simulate correlation matrixes of financial markets. It produces realistic outcomes regarding stylized facts of empirical correlation matrixes and requires no asset return input data. The matrix generation is based on a multiobjective evolutionary algorithm, so the authors call the approach matrix evolutions. It is suitable for parallel implementation and can be accelerated by graphics processing units and quantum-inspired algorithms. The approach is useful for backtesting, pricing, and hedging correlation-dependent investment strategies and financial products. Its potential is demonstrated in a machine learning case study for robust portfolio construction in a multi-asset universe: An explainable machine learning program links the synthetic matrixes to the portfolio volatility spread of hierarchical risk parity versus equal risk contribution.

Topics: Statistical methods, big data/machine learning, portfolio construction, performance measurement

Key Findings
▪ The authors introduce the matrix evolutions concept based on an evolutionary algorithm to simulate correlation matrixes useful for financial market applications.
▪ They apply the resulting synthetic correlation matrixes to benchmark hierarchical risk parity (HRP) and equal risk contribution allocations of a multi-asset futures portfolio and find HRP to show lower portfolio risk.
▪ The authors evaluate three competing machine learning methods to regress the portfolio risk spread between both allocation methods against statistical features of the synthetic correlation matrixes and then discuss the local and global feature importance using the SHAP framework by Lundberg and Lee (2017).

Citations: 11
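The Matrix Evolutions abstract above describes generating correlation matrixes with an evolutionary algorithm and no return data. The following is only a minimal sketch of that general idea, not the authors' multiobjective method: it assumes a single hypothetical objective (hitting a target average off-diagonal correlation) and a simple mutate-and-keep-if-better loop. The function names `to_correlation`, `mean_offdiag`, and `evolve`, and all parameter values, are illustrative assumptions.

```python
import numpy as np

def to_correlation(a):
    """Map any real matrix A to a valid correlation matrix via A A^T with unit diagonal."""
    cov = a @ a.T
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)
    np.fill_diagonal(corr, 1.0)
    return corr

def mean_offdiag(corr):
    """Average of the off-diagonal entries of a square matrix with unit diagonal."""
    n = corr.shape[0]
    return (corr.sum() - n) / (n * (n - 1))

def evolve(n_assets=8, target=0.3, generations=500, sigma=0.1, seed=0):
    """Toy single-objective evolution: mutate the parent and keep the child if it is no worse.

    The objective (distance of the mean correlation from `target`) is an
    assumption made for illustration; the article uses several objectives.
    """
    rng = np.random.default_rng(seed)
    parent = rng.normal(size=(n_assets, n_assets))
    parent_fit = abs(mean_offdiag(to_correlation(parent)) - target)
    for _ in range(generations):
        child = parent + sigma * rng.normal(size=parent.shape)
        child_fit = abs(mean_offdiag(to_correlation(child)) - target)
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return to_correlation(parent), parent_fit
```

Calling `corr, fitness = evolve()` returns a symmetric, positive semidefinite matrix with unit diagonal, because every candidate is built as a rescaled A Aᵀ. Searching over the unconstrained matrix A rather than over correlations directly is a design choice of this sketch that keeps every mutation valid.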

Hyperparameter Optimization for Portfolio Selection

The Journal of Financial Data Science Pub Date : 2020-06-18 DOI: 10.3905/jfds.2020.1.035

P. Nystrup, Erik Lindström, H. Madsen

Abstract: Portfolio selection involves a trade-off between maximizing expected return and minimizing risk. In practice, useful formulations also include various costs and constraints that regularize the problem and reduce the risk due to estimation errors, resulting in solutions that depend on a number of hyperparameters. As the number of hyperparameters grows, selecting their value becomes increasingly important and difficult. In this article, the authors propose a systematic approach to hyperparameter optimization by leveraging recent advances in automated machine learning and multiobjective optimization. They optimize hyperparameters on a train set to yield the best result subject to market-determined realized costs. In applications to single- and multiperiod portfolio selection, they show that sequential hyperparameter optimization finds solutions with better risk–return trade-offs than manual, grid, and random search over hyperparameters using fewer function evaluations. At the same time, the solutions found are more stable from in-sample training to out-of-sample testing, suggesting they are less likely to be extremities that randomly happened to yield good performance in training.

Topics: Portfolio theory, portfolio construction, big data/machine learning

Key Findings
• The growing number of applications of machine-learning approaches to portfolio selection means that hyperparameter optimization becomes increasingly important. We propose a systematic approach to hyperparameter optimization by leveraging recent advances in automated machine learning and multiobjective optimization.
• We establish a connection between forecast uncertainty and holding- and trading-cost parameters in portfolio selection. We argue that they should be considered regularization parameters that can be adjusted in training to achieve optimal performance when tested subject to realized costs.
• We show that multiobjective optimization can find solutions with better risk–return trade-offs than manual, grid, and random search over hyperparameters for portfolio selection. At the same time, the solutions are more stable across in-sample training and out-of-sample testing.

Citations: 5

The Cross Section of Commodity Returns: A Nonparametric Approach

The Journal of Financial Data Science Pub Date : 2020-06-17 DOI: 10.3905/jfds.2020.1.034

C. Struck, Enoch Cheng

Citations: 3

Deep Sequence Modeling: Development and Applications in Asset Pricing

The Journal of Financial Data Science Pub Date : 2020-06-01 DOI: 10.3905/jfds.2020.1.053

Lingbo Cong, Ke Tang, Jingyuan Wang, Yang Zhang

Abstract: The authors predict asset returns and measure risk premiums using a prominent technique from artificial intelligence: deep sequence modeling. Because asset returns often exhibit sequential dependence that may not be effectively captured by conventional time-series models, sequence modeling offers a promising path with its data-driven approach and superior performance. In this article, the authors first overview the development of deep sequence models, introduce their applications in asset pricing, and discuss their advantages and limitations. They then perform a comparative analysis of these methods using data on US equities. They demonstrate how sequence modeling benefits investors in general through incorporating complex historical path dependence and that long short-term memory–based models tend to have the best out-of-sample performance.

Topics: Big data/machine learning, security analysis and valuation, performance measurement

Key Findings
▪ This article provides a concise synopsis of deep sequence modeling with an emphasis on its historical development in the field of computer science and artificial intelligence. It serves as a reference source for social scientists who aim to use the tool to supplement conventional time-series and panel methods.
▪ Deep sequence models can be adapted successfully for asset pricing, especially in predicting asset returns, which allow the model to be flexible to capture the high-dimensionality, nonlinear, interactive, low signal-to-noise, and dynamic nature of financial data. In particular, the model’s ability to detect path-dependence patterns makes it versatile and effective, potentially outperforming existing models.
▪ This article provides a horse-race comparison of various deep sequence models for the tasks of forecasting returns and measuring risk premiums. Long short-term memory has the best performance in terms of out-of-sample predictive R², and long short-term memory with an attention mechanism has the best portfolio performance when excluding microcap stocks.

Citations: 10
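The Deep Sequence Modeling entry above ranks models by out-of-sample predictive R² without defining it. As a hedged illustration, the sketch below uses one convention that is common in return prediction, where the benchmark is a forecast of zero rather than the sample mean. Whether the article uses exactly this definition is an assumption, and the function name `r2_oos` is hypothetical.

```python
import numpy as np

def r2_oos(realized, predicted):
    """Out-of-sample R^2 against a zero-forecast benchmark (assumed convention).

    R^2 = 1 - sum((r - r_hat)^2) / sum(r^2), computed on held-out data only.
    """
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((realized - predicted) ** 2)  # squared forecast errors
    sst = np.sum(realized ** 2)                # squared errors of a zero forecast
    return 1.0 - sse / sst
```

Under this convention a model that always predicts zero scores exactly 0, and a negative value means the model does worse than predicting no return at all.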

Greedy Online Classification of Persistent Market States Using Realized Intraday Volatility Features

The Journal of Financial Data Science Pub Date : 2020-05-06 DOI: 10.2139/ssrn.3594875

P. Nystrup, Petter N. Kolm, Erik Lindström

Abstract: In many financial applications, it is important to classify time-series data without any latency while maintaining persistence in the identified states. The authors propose a greedy online classifier that contemporaneously determines which hidden state a new observation belongs to without the need to parse historical observations and without compromising persistence. Their classifier is based on the idea of clustering temporal features while explicitly penalizing jumps between states by a fixed-cost regularization term that can be calibrated to achieve a desired level of persistence. Through a series of return simulations, the authors show that in most settings their new classifier remarkably obtains a higher accuracy than the correctly specified maximum likelihood estimator. They illustrate that the new classifier is more robust to misspecification and yields state sequences that are significantly more persistent both in and out of sample. They demonstrate how classification accuracy can be further improved by including features that are based on intraday data. Finally, the authors apply the new classifier to estimate persistent states of the S&P 500 Index.

Topics: Statistical methods, simulations, big data/machine learning

Key Findings
• A new greedy online classifier is proposed that contemporaneously determines which hidden state a new observation belongs to without the need to parse historical observations and without compromising temporal persistence.
• A series of simulations demonstrates that the new classifier frequently obtains a higher accuracy and is more robust to misspecification than the correctly specified maximum likelihood estimator.
• Classification accuracy can be improved by including features that are based on intraday volatility data.

Citations: 9
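The greedy online classifier described in the entry above assigns each new observation to a state while charging a fixed cost for switching states. A minimal sketch of that decision rule follows, under stated assumptions: the state centers are taken as already fitted and fixed, the fit loss is squared Euclidean distance, and the name `greedy_online_states` and its parameters are hypothetical rather than taken from the article.

```python
import numpy as np

def greedy_online_states(features, centers, jump_penalty):
    """Classify observations one at a time with a fixed cost for changing state.

    features: array of shape (T, d), one feature vector per time step.
    centers:  array of shape (K, d), assumed pre-fitted state centers.
    jump_penalty: fixed cost added to every state other than the previous one.
    """
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    state_ids = np.arange(len(centers))
    states = []
    prev = None
    for x in features:
        # Squared distance of the new observation to each state center.
        cost = np.sum((centers - x) ** 2, axis=1)
        if prev is not None:
            # Staying in the previous state is free; any jump pays the penalty.
            cost = cost + jump_penalty * (state_ids != prev)
        prev = int(np.argmin(cost))
        states.append(prev)
    return states

obs = [[0.0], [0.1], [0.6], [0.1], [1.0], [0.9]]
centers = [[0.0], [1.0]]
print(greedy_online_states(obs, centers, jump_penalty=0.0))  # [0, 0, 1, 0, 1, 1]
print(greedy_online_states(obs, centers, jump_penalty=0.5))  # [0, 0, 0, 0, 1, 1]
```

With no penalty the borderline observation 0.6 triggers a one-step switch; with a penalty of 0.5 the sequence stays in state 0 until the evidence for state 1 is strong. Each decision uses only the current observation and the previous state, which is what makes the rule greedy and online.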

Portfolio Selection Using Portfolio Committees

The Journal of Financial Data Science Pub Date : 2020-05-01 DOI: 10.2139/ssrn.3653595

Tsungwu Ho

Citations: 0

Inside the Mind of Investors During the COVID-19 Pandemic: Evidence from the StockTwits Data

The Journal of Financial Data Science Pub Date : 2020-04-23 DOI: 10.2139/ssrn.3583462

Hasan Fallahgoul

Abstract: The authors study investor beliefs (sentiment and disagreement) about stock market returns during the COVID-19 pandemic using a large number of investor messages, about 3.7 million, on a social media investing platform, StockTwits. The rich and multimodal features of StockTwits data allow the authors to explore the evolution of sentiment and disagreement within and across investors, sectors, and even industries. The authors find that sentiment (disagreement) has a sharp decrease (increase) across all investors with any investment philosophy, horizon, and experience between February 19, 2020, and March 23, 2020, when a historical market high was followed by a record drop. Surprisingly, these measures have a sharp reversal toward the end of March. However, the performance of these measures across various sectors is heterogeneous. Financial and healthcare sectors are the most pessimistic and optimistic divisions, respectively.

Topics: Security analysis and valuation, quantitative methods, big data/machine learning, financial crises and financial market history, performance measurement

Key Findings
▪ Daily time series of the sentiment and disagreement is not a stationary process.
▪ Sentiment (disagreement) has a sharp decrease (increase) across all investors with any investment philosophy, horizon, and experience between February 19, 2020, and March 23, 2020, when a historical market high was followed by a record drop.
▪ The financial and healthcare sectors are the most pessimistic and optimistic divisions, respectively.

Citations: 15

It’s All About Data: How to Make Good Decisions in a World Awash with Information

The Journal of Financial Data Science Pub Date : 2020-03-04 DOI: 10.3905/jfds.2020.1.025

Mehrzad Mahdavi, Hossein Kazemi

Abstract: The rise of big and alternative data has created significant new business opportunities in the financial sector. As we start on this journey of fast-moving technology disruption, financial professionals have a rare opportunity to balance the exponential growth of artificial intelligence (AI)/data science with ethics, bias, and privacy to create trusted data-driven decision making. In this article, the authors discuss the nuances of big data sets that are critical when one considers standards, processes, best practices, and modeling algorithms for the deployment of AI systems. In addition, this industry is widely guided by a fiduciary standard that puts the interests of the client above all else. It is therefore critical to have a thorough understanding of the limitations of our knowledge, because there are many known unknowns and unknown unknowns that can have a significant impact on outcomes. The authors emphasize key success factors for the deployment of AI initiatives: talent and bridging the skills gap. To achieve a lasting impact of big data initiatives, multidisciplinary teams with well-defined roles need to be established with continuing training and education. The prize is the finance of the future.

Topics: Simulations, big data/machine learning

Key Findings
• The rise of alternative data in finance is creating major opportunities in all areas of the financial industry, including risk management, portfolio construction, investment banking, and insurance.
• To build trusted outcomes in AI/ML initiatives, financial professionals’ roles are critical. Given the many nuances in using big data, there is a need for vetted protocols and methods in selecting data sets and algorithms. Best practices and guidelines are effective in reducing the risks of using AI/ML, including overfitting, lack of interpretability, biased inputs, and unethical use of data.
• Given the major shortage of talent in AI/data science in finance, practical training of employees and continued education are key to scaling the rollout and enabling the future of finance.

Citations: 0

PCA for Implied Volatility Surfaces

The Journal of Financial Data Science Pub Date : 2020-01-31 DOI: 10.3905/jfds.2020.1.032

M. Avellaneda, Brian F. Healy, A. Papanicolaou, G. Papanicolaou

Abstract: Principal component analysis (PCA) is a useful tool when trying to construct factor models from historical asset returns. For the implied volatilities of US equities, there is a PCA-based model with a principal eigenportfolio whose return time series lies close to that of an overarching market factor. The authors show that this market factor is the index resulting from the daily compounding of a weighted average of implied-volatility returns, with weights based on the options’ open interest and Vega. The authors also analyze the singular vectors derived from the tensor structure of the implied volatilities of S&P 500 constituents and find evidence indicating that some type of open interest- and Vega-weighted index should be one of at least two significant factors in this market.

Topics: Statistical methods, simulations, big data/machine learning

Key Findings
• Principal component analysis of a comprehensive dataset of implied volatility surfaces from options on US equities shows that their collective behavior is captured by just nine factors, whereas the effective spatial dimension of the residuals is closer to 500 than to the nominal dimension of 28,000, revealing the large redundancy in the data.
• Portfolios of implied volatility surface returns, weighed suitably by open interest and Vega, track the principal eigenportfolio associated with a market portfolio of options, in analogy to equity portfolios.
• Retention of the tensor structure in the eigenportfolio analysis improves the tracking between the open interest–Vega weighted (tensor) implied volatility surface returns portfolio and the (tensor) eigenportfolio, indicating that data structure matters.

Citations: 6
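The PCA entry above centers on a principal eigenportfolio that tracks a market factor. The sketch below illustrates that construction on synthetic one-factor returns, not on implied-volatility data, and it compares the eigenportfolio with the simulated factor rather than with an open interest- and Vega-weighted index. Dividing the leading eigenvector by each series' volatility is one common eigenportfolio convention and is assumed here; the function name `principal_eigenportfolio` and all simulation parameters are hypothetical.

```python
import numpy as np

def principal_eigenportfolio(returns):
    """Weights of the leading eigenportfolio of the return correlation matrix.

    returns: array of shape (T, N). Weights are the top eigenvector divided by
    each series' volatility, normalized to sum to one (assumed convention).
    """
    returns = np.asarray(returns, dtype=float)
    vol = returns.std(axis=0, ddof=1)
    corr = np.corrcoef(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    if v.sum() < 0:                          # resolve the sign ambiguity
        v = -v
    weights = v / vol
    weights = weights / weights.sum()
    explained = eigvals[-1] / eigvals.sum()  # variance share of the first factor
    return weights, explained

# Synthetic data: 20 series driven by one common factor plus idiosyncratic noise.
rng = np.random.default_rng(1)
factor = 0.01 * rng.normal(size=1000)
betas = rng.uniform(0.5, 1.5, size=20)
noise = 0.01 * rng.normal(size=(1000, 20))
returns = factor[:, None] * betas[None, :] + noise

weights, explained = principal_eigenportfolio(returns)
tracking = np.corrcoef(returns @ weights, factor)[0, 1]
print(f"first-factor variance share: {explained:.2f}, tracking correlation: {tracking:.2f}")
```

In this toy setting the eigenportfolio return should correlate strongly with the simulated common factor, which mirrors the kind of tracking relationship the abstract reports for a suitably weighted index.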

Managing Editor’s Letter

The Journal of Financial Data Science Pub Date : 2020-01-31 DOI: 10.3905/jfds.2020.2.1.001

Francesco A. Fabozzi

Abstract: The four issues of the 2019 inaugural publication of The Journal of Financial Data Science by all metrics indicate the success of the journal. Four of the articles published in JFDS were in the top 10 most downloaded articles across the Portfolio Management Research (PMR) platform. This is quite an accomplishment considering that JFDS represented just one year of articles. After publication of the first issue, articles in JFDS were featured in an opinion piece on the challenges of implementing machine learning by David Stevenson (“Machine Learning Revolution is Still Some Way Off”) published in the Financial Times. One of the articles in the inaugural issue is highlighted by Bill Kelly, the CEO of the CAIA Association, in an August 2019 blog (“Whatfore Art Thou Use of Alt-Data?”) in AllAboutAlpha. The Financial Data Professional Institute (FDPI), established by the CAIA Association, will be adopting at least five articles from JFDS as required reading for their membership exams. As researchers in this space produce papers, our expectation is that the journal will be well cited.

In the first issue of Volume 2, there are nine articles which are summarized below.

“Machine Learning in Asset Management—Part 1: Portfolio Construction—Trading Strategies” is the first in a series of articles by Derek Snow dealing with machine learning in asset management. The series will cover the applications to the major tasks of asset management: (1) portfolio construction, (2) risk management, (3) capital management, (4) infrastructure and deployment, and (5) sales and marketing. Portfolio construction is divided into trading and weight optimization. The primary focus of the current article is on how machine learning can be used to improve various types of trading strategies, while weight optimization is the subject of the next article in the series. Snow classifies trading strategies according to their respective machine-learning frameworks (i.e., reinforcement, supervised and unsupervised learning). He then explains the difference between reinforcement learning and supervised learning, both conceptually and in relation to their respective advantages and disadvantages.

Global equity and bond asset management require techniques that also put effort into understanding the structure of the interactions. Network analysis offers asset managers insightful information regarding factor-based connectedness, relationships, and how risk is transferred between network components. Gueorgui Konstantinov and Mario Rusev demonstrate the relation between global equity and bond funds from a network perspective. In their article, “The Bond–Equity–Fund Relation Using the Fama–French–Carhart Factors: A Practical Network Approach,” they show the advantages of graph theory to explain the collective …

Citations: 0