{"title":"Quantifying ESG alpha using scholar big data: an automated machine learning approach","authors":"Qian Chen, Xiao-Yang Liu","doi":"10.1145/3383455.3422529","DOIUrl":"https://doi.org/10.1145/3383455.3422529","url":null,"abstract":"ESG (Environmental, social and governance) alpha strategy that makes sustainable investment has gained popularity among investors. The ESG fields of study in scholar big data is a valuable alternative data that reflects a company's long-term ESG commitment. However, it is considered a difficulty to quantitatively measure a company's ESG premium and its impact to the company's stock price using scholar big data. In this paper, we utilize ESG scholar data as alternative data to develop an automatic trading strategy and propose a practical machine learning approach to quantify the ESG premium of a company and capture the ESG alpha. First, we construct our ESG investment universe and apply feature engineering on the companies' ESG scholar data from the Microsoft Academic Graph database. Then, we train six complementary machine learning models using a combination of financial indicators and ESG scholar data features and employ an ensemble method to predict stock prices and automatically set up portfolio allocation. Finally, we manage our portfolio, trade and rebalance the portfolio allocation monthly using predicted stock prices. We backtest our ESG alpha strategy and compare its performance with benchmarks. The proposed ESG alpha strategy achieves a cumulative return of 2,154.4% during the backtesting period of ten years, which significantly outperforms the NASDAQ-100 index's 397.4% and S&P 500's 226.9%. 
The traditional financial indicators results in only 1,443.7%, thus our scholar data-based ESG alpha strategy is better at capturing ESG premium than traditional financial indicators.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114688342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sig-SDEs model for quantitative finance","authors":"Imanol Perez Arribas, C. Salvi, L. Szpruch","doi":"10.1145/3383455.3422553","DOIUrl":"https://doi.org/10.1145/3383455.3422553","url":null,"abstract":"Mathematical models, calibrated to data, have become ubiquitous to make key decision processes in modern quantitative finance. In this work, we propose a novel framework for data-driven model selection by integrating a classical quantitative setup with a generative modelling approach. Leveraging the properties of the signature, a well-known path-transform from stochastic analysis that recently emerged as leading machine learning technology for learning time-series data, we develop the Sig-SDE model. Sig-SDE provides a new perspective on neural SDEs and can be calibrated to exotic financial products that depend, in a non-linear way, on the whole trajectory of asset prices. Furthermore, we our approach enables to consistently calibrate under the pricing measure Q and real-world measure P. Finally, we demonstrate the ability of Sig-SDE to simulate future possible market scenarios needed for computing risk profiles or hedging strategies. Importantly, this new model is underpinned by rigorous mathematical analysis, that under appropriate conditions provides theoretical guarantees for convergence of the presented algorithms.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125370424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Option hedging with risk averse reinforcement learning","authors":"Edoardo Vittori, M. Trapletti, Marcello Restelli","doi":"10.1145/3383455.3422532","DOIUrl":"https://doi.org/10.1145/3383455.3422532","url":null,"abstract":"In this paper we show how risk-averse reinforcement learning can be used to hedge options. We apply a state-of-the-art risk-averse algorithm: Trust Region Volatility Optimization (TRVO) to a vanilla option hedging environment, considering realistic factors such as discrete time and transaction costs. Realism makes the problem twofold: the agent must both minimize volatility and contain transaction costs, these tasks usually being in competition. We use the algorithm to train a sheaf of agents each characterized by a different risk aversion, so to be able to span an efficient frontier on the volatility-p&l space. The results show that the derived hedging strategy not only outperforms the Black & Scholes delta hedge, but is also extremely robust and flexible, as it can efficiently hedge options with different characteristics and work on markets with different behaviors than what was used in training.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127658171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning fund categorizations","authors":"D. Mehta, Dhruv Desai, Jithin Pradeep","doi":"10.1145/3383455.3422555","DOIUrl":"https://doi.org/10.1145/3383455.3422555","url":null,"abstract":"Given the surge in popularity of mutual funds (including exchange-traded funds (ETFs)) as a diversified financial investment, a vast variety of mutual funds from various investment management firms and diversification strategies have become available in the market. Identifying similar mutual funds among such a wide landscape of mutual funds has become more important than ever because of many applications ranging from sales and marketing to portfolio replication, portfolio diversification and tax loss harvesting. The current best method is data-vendor provided categorization which usually relies on curation by human experts with the help of available data. In this work, we establish that an industry wide well-regarded categorization system is learnable using machine learning and largely reproducible, and in turn constructing a truly data-driven categorization. We discuss the intellectual challenges in learning this man-made system, our results and their implications.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133364423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning methods to detect money laundering in the bitcoin blockchain in the presence of label scarcity","authors":"Joana Lorenz, Maria Inês Silva, David Oliveira Aparício, João Tiago Ascensão, P. Bizarro","doi":"10.1145/3383455.3422549","DOIUrl":"https://doi.org/10.1145/3383455.3422549","url":null,"abstract":"Every year, criminals launder billions of dollars acquired from serious felonies (e.g., terrorism, drug smuggling, or human trafficking), harming countless people and economies. Cryptocurrencies, in particular, have developed as a haven for money laundering activity. Machine Learning can be used to detect these illicit patterns. However, labels are so scarce that traditional supervised algorithms are inapplicable. Here, we address money laundering detection assuming minimal access to labels. First, we show that existing state-of-the-art solutions using unsupervised anomaly detection methods are inadequate to detect the illicit patterns in a real Bitcoin transaction dataset. Then, we show that our proposed active learning solution is capable of matching the performance of a fully supervised baseline by using just 5% of the labels. This solution mimics a typical real-life situation in which a limited number of labels can be acquired through manual annotation by experts.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122573028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conditional mutual information-based contrastive loss for financial time series forecasting","authors":"Hanwei Wu, Ather Gattami, M. Flierl","doi":"10.1145/3383455.3422550","DOIUrl":"https://doi.org/10.1145/3383455.3422550","url":null,"abstract":"We present a representation learning framework for financial time series forecasting. One challenge of using deep learning models for finance forecasting is the shortage of available training data when using small datasets. Direct trend classification using deep neural networks trained on small datasets is susceptible to the overfitting problem. In this paper, we propose to first learn compact representations from time series data, then use the learned representations to train a simpler model for predicting time series movements. We consider a class-conditioned latent variable model. We train an encoder network to maximize the mutual information between the latent variables and the trend information conditioned on the encoded observed variables. We show that conditional mutual information maximization can be approximated by a contrastive loss. Then, the problem is transformed into a classification task of determining whether two encoded representations are sampled from the same class or not. This is equivalent to performing pairwise comparisons of the training datapoints, and thus, improves the generalization ability of the encoder network. We use deep autoregressive models as our encoder to capture long-term dependencies of the sequence data. 
Empirical experiments indicate that our proposed method has the potential to advance state-of-the-art performance.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"8 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114135328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal, truthful, and private securities lending","authors":"Emily Diana, Michael Kearns, S. Neel, Aaron Roth","doi":"10.1145/3383455.3422541","DOIUrl":"https://doi.org/10.1145/3383455.3422541","url":null,"abstract":"We consider a fundamental dynamic allocation problem motivated by the problem of securities lending in financial markets, the mechanism underlying the short selling of stocks. A lender would like to distribute a finite number of identical copies of some scarce resource to n clients, each of whom has a private demand that is unknown to the lender. The lender would like to maximize the usage of the resource --- avoiding allocating more to a client than her true demand --- but is constrained to sell the resource at a pre-specified price per unit, and thus cannot use prices to incentivize truthful reporting. We first show that the Bayesian optimal algorithm for the one-shot problem --- which maximizes the resource's expected usage according to the posterior expectation of demand, given reports --- actually incentivizes truthful reporting as a dominant strategy. Because true demands in the securities lending problem are often sensitive information that the client would like to hide from competitors, we then consider the problem under the additional desideratum of (joint) differential privacy. We give an algorithm, based on simple dynamics for computing market equilibria, that is simultaneously private, approximately optimal, and approximately dominant-strategy truthful. 
Finally, we leverage this private algorithm to construct an approximately truthful, optimal mechanism for the extensive form multi-round auction where the lender does not have access to the true joint distributions between clients' requests and demands.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130110006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Get real: realism metrics for robust limit order book market simulations","authors":"Svitlana Vyetrenko, David Byrd, Nick Petosa, Mahmoud Mahfouz, Danial Dervovic, M. Veloso, T. Balch","doi":"10.1145/3383455.3422561","DOIUrl":"https://doi.org/10.1145/3383455.3422561","url":null,"abstract":"Market simulation is an increasingly important method for evaluating and training trading strategies and testing \"what if\" scenarios. The extent to which results from these simulations can be trusted depends on how realistic the environment is for the strategies being tested. As a step towards providing benchmarks for realistic simulated markets, we enumerate measurable stylized facts of limit order book (LOB) markets across multiple asset classes from the literature. We apply these metrics to data from real markets and compare the results to data originating from simulated markets. We illustrate their use in five different simulated market configurations: The first (market replay) is frequently used in practice to evaluate trading strategies; the other four are interactive agent based simulation (IABS) configurations which combine zero intelligence agents, and agents with limited strategic behavior. These simulated agents rely on an internal \"oracle\" that provides a fundamental value for the asset. In traditional IABS methods the fundamental originates from a mean reverting random walk. We show that markets exhibit more realistic behavior when the fundamental arises from historical market data. 
We further experimentally illustrate the effectiveness of IABS techniques as opposed to market replay.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121920089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable clustering and application to wealth management compliance","authors":"Enguerrand Horel, K. Giesecke, Victor Storchan, Naren Chittar","doi":"10.1145/3383455.3422530","DOIUrl":"https://doi.org/10.1145/3383455.3422530","url":null,"abstract":"Many applications from the financial industry successfully leverage clustering algorithms to reveal meaningful patterns among a vast amount of unstructured financial data. However, these algorithms suffer from a lack of interpretability that is required both at a business and regulatory level. In order to overcome this issue, we propose a novel two-steps method to explain clusters. A classifier is first trained to predict the clusters labels, then the Single Feature Introduction Test (SFTT) method is run on the model to identify the statistically significant features that characterize each cluster. We describe a real wealth management compliance use-case that highlights the necessity of such an interpretable clustering method. We illustrate the performance of the method using simulated data and through an experiment on financial ratios of U.S. companies.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128736614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trading via image classification","authors":"N. Cohen, T. Balch, M. Veloso","doi":"10.1145/3383455.3422544","DOIUrl":"https://doi.org/10.1145/3383455.3422544","url":null,"abstract":"The art of systematic financial trading evolved with an array of approaches, ranging from simple strategies to complex algorithms, all relying primarily on aspects of time-series analysis (e.g., Murphy, 1999; De Prado, 2018; Tsay, 2005). After visiting the trading floor of a leading financial institution, we noticed that traders always execute their trade orders while observing images of financial time-series on their screens. In this work, we build upon image recognition's success (e.g., Krizhevsky et al., 2012; Szegedy et al., 2015; Zeiler and Fergus, 2014; Wang et al., 2017; Koch et al., 2015; LeCun et al., 2015) and examine the value of transforming the traditional time-series analysis to that of image classification. We create a large sample of financial time-series images encoded as candlestick (Box and Whisker) charts and label the samples following three algebraically-defined binary trade strategies (Murphy, 1999). Using the images, we train over a dozen machine-learning classification models and find that the algorithms efficiently recover the complicated, multiscale label-generating rules when the data is visually represented. 
We suggest that the transformation of continuous numeric time-series classification problem to a vision problem is useful for recovering signals typical of technical analysis.","PeriodicalId":447950,"journal":{"name":"Proceedings of the First ACM International Conference on AI in Finance","volume":"292 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134601928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}