An Explainable Machine Learning Framework for Cross-Sectional Forecast-Based Fund Selection

ERN: Other Econometrics: Econometric & Statistical Methods - Special Topics (Topic) Pub Date: 2020-08-14 DOI: 10.2139/ssrn.3707595

Giulio Trichilo


Citations: 0

Abstract

Since the 1990s, the global hedge fund industry has expanded rapidly. Its growing presence in financial markets, from equity to fixed income and derivatives, has inextricably linked it to the broader financial industry, with larger funds effectively acting as market makers and liquidity providers in many markets. For both academics and practitioners, the space has established itself as a key area of research, given the vast heterogeneity of investment styles and the high mutability of the industry. In this thesis, a cross-sectional fund selection approach that builds upon the paradigm of explainable machine learning is proposed in a fully systematic setting. Four fund performance metrics (the Sharpe and Sortino ratios, fund alpha, and its t-statistic) are used as the ranking and selection metrics, at investable inter-regime forecast horizons of 24 and 36 months. We find that quintile portfolios constructed from machine learning and deep learning approaches outperform both linear models and benchmark portfolios built solely from historical realizations of the forecast metric, in both absolute and risk-adjusted terms. We find that the extreme quintile portfolios realize correspondingly high (top quintile) and low (bottom quintile) values of the performance metric used as the forecast target in model training. We find that forecasting on the Sortino ratio yields the most consistent overall performance, and that machine learning methods are particularly beneficial for bottom-quintile fund selection (consistent identification of under-performers) when forecasting on fund alpha. Explainability, achieved via the use of SHAP values, further serves to outline feature importance at both the aggregate and the individual fund level. At the aggregate level, all methods agree on a subset of statistically consistent predictors across investment style and forecast horizon, with discernible relevance of predictors constructed from interactions of fund returns with nowcasters, and of management quality indicators. This consistency enables a discretionary fund selection process to be complemented by model forecasts and SHAP-value-based feature importance delineations. There is thus evidence that the proposed approach may be valuable for a discretionary fund manager looking to incorporate machine-learning-based signals into their selection process.
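The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulas, so the code below uses common textbook definitions of the per-period Sharpe and Sortino ratios (not annualized) and a simple equal-count quintile split. All function names and the `forecasts` mapping are hypothetical and are not taken from the thesis.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def sortino_ratio(returns, target=0.0):
    """Mean excess return over `target` divided by the downside deviation.

    Assumption: downside deviation is the root mean square of the shortfalls
    below `target`, averaged over all periods (one common convention).
    """
    excess = [r - target for r in returns]
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0.0:
        raise ValueError("no returns below target; Sortino ratio is undefined")
    return mean(excess) / downside


def quintile_portfolios(forecasts):
    """Split fund ids into five groups by forecast value, highest first.

    `forecasts` maps a fund id to its forecast metric (hypothetical input).
    Group 0 is the top quintile and group 4 is the bottom quintile.
    """
    ranked = sorted(forecasts, key=forecasts.get, reverse=True)
    n = len(ranked)
    return [ranked[i * n // 5:(i + 1) * n // 5] for i in range(5)]
```

In this sketch, a model's forecast of a metric such as the Sortino ratio would populate `forecasts`, and the first and last groups returned by `quintile_portfolios` correspond to the extreme quintile portfolios discussed in the abstract.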



