Jiayi Du, M. Jin, Petter N. Kolm, G. Ritter, Yixuan Wang, Bofei Zhang
The Journal of Financial Data Science, published 2020-09-09. DOI: 10.3905/JFDS.2020.1.045 (https://doi.org/10.3905/JFDS.2020.1.045)
Deep Reinforcement Learning for Option Replication and Hedging
The authors propose models for the solution of the fundamental problem of option replication subject to discrete trading, round lotting, and nonlinear transaction costs using state-of-the-art methods in deep reinforcement learning (DRL), including deep Q-learning, deep Q-learning with Pop-Art, and proximal policy optimization (PPO). Each DRL model is trained to hedge a whole range of strikes, and no retraining is needed when the user changes to another strike within the range. The models are general: the user can plug in any option pricing and simulation library and then train the models, with no further modifications, to hedge arbitrary option portfolios. Through a series of simulations, the authors show that the DRL models learn strategies that are similar to or better than delta hedging. Of all the models, PPO performs best in terms of profit and loss, training time, and amount of data needed for training.

TOPICS: Big data/machine learning, options, risk management, simulations

Key Findings
• The authors propose models for the replication of options over a whole range of strikes subject to discrete trading, round lotting, and nonlinear transaction costs, based on state-of-the-art methods in deep reinforcement learning including deep Q-learning and proximal policy optimization.
• The models allow the user to plug in any option pricing and simulation library and then train the models, with no further modifications, to hedge arbitrary option portfolios.
• A series of simulations demonstrates that the deep reinforcement learning models learn strategies that are similar to or better than delta hedging.
• Proximal policy optimization outperforms the other models in terms of profit and loss, training time, and amount of data needed for training.
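To make the delta-hedging benchmark concrete, the following is a minimal Python sketch of a discretely rebalanced delta hedge with round lotting and a nonlinear trading cost. It is not the authors' code. Everything specific here is an assumption made for illustration: the Black-Scholes delta with zero rates, the geometric Brownian motion simulator, the functional form and parameters of `trade_cost`, and all function and parameter names.

```python
import math
import random


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_delta(spot: float, strike: float, vol: float, tau: float) -> float:
    """Black-Scholes delta of a European call (assumes zero rates and dividends)."""
    if tau <= 0.0:
        return 1.0 if spot > strike else 0.0
    d1 = (math.log(spot / strike) + 0.5 * vol * vol * tau) / (vol * math.sqrt(tau))
    return norm_cdf(d1)


def trade_cost(shares: int, tick: float = 0.1, kappa: float = 0.01) -> float:
    """Hypothetical nonlinear cost: a linear spread term plus a quadratic impact term."""
    n = abs(shares)
    return tick * (n + kappa * n * n)


def simulate_delta_hedge(strike, spot0=100.0, vol=0.2, horizon=10 / 252,
                         n_steps=50, shares_per_contract=100, lot=1, seed=0):
    """Hedge one short call by rebalancing to the rounded delta at each step."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    spot = spot0
    position = 0        # shares of the underlying currently held
    stock_pnl = 0.0     # cumulative gain on the stock position
    total_cost = 0.0    # cumulative transaction cost
    for step in range(n_steps):
        tau = horizon - step * dt
        target = bs_call_delta(spot, strike, vol, tau) * shares_per_contract
        target = lot * round(target / lot)      # round lotting: trade whole lots only
        total_cost += trade_cost(target - position)
        position = target
        # One geometric Brownian motion step with zero drift.
        z = rng.gauss(0.0, 1.0)
        new_spot = spot * math.exp(-0.5 * vol * vol * dt + vol * math.sqrt(dt) * z)
        stock_pnl += position * (new_spot - spot)
        spot = new_spot
    payoff = shares_per_contract * max(spot - strike, 0.0)
    return {"final_spot": spot, "final_position": position,
            "stock_pnl": stock_pnl, "option_payoff": payoff,
            "total_cost": total_cost}


if __name__ == "__main__":
    # The same routine covers a range of strikes without any change.
    for k in (95.0, 100.0, 105.0):
        print(k, simulate_delta_hedge(strike=k, seed=42))
```

In a sketch like this, a learned policy would replace the line that sets `target` from the Black-Scholes delta, choosing the trade from the observed state instead, while the simulator and cost function stay the same.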