{"title":"Optimizing Deep Reinforcement Learning for American Put Option Hedging","authors":"Reilly Pickard, F. Wredenhagen, Y. Lawryshyn","doi":"arxiv-2405.08602","DOIUrl":null,"url":null,"abstract":"This paper contributes to the existing literature on hedging American options\nwith Deep Reinforcement Learning (DRL). The study first investigates\nhyperparameter impact on hedging performance, considering learning rates,\ntraining episodes, neural network architectures, training steps, and\ntransaction cost penalty functions. Results highlight the importance of\navoiding certain combinations, such as high learning rates with a high number\nof training episodes or low learning rates with few training episodes and\nemphasize the significance of utilizing moderate values for optimal outcomes.\nAdditionally, the paper warns against excessive training steps to prevent\ninstability and demonstrates the superiority of a quadratic transaction cost\npenalty function over a linear version. This study then expands upon the work\nof Pickard et al. (2024), who utilize a Chebyshev interpolation option pricing\nmethod to train DRL agents with market calibrated stochastic volatility models.\nWhile the results of Pickard et al. (2024) showed that these DRL agents achieve\nsatisfactory performance on empirical asset paths, this study introduces a\nnovel approach where new agents at weekly intervals to newly calibrated\nstochastic volatility models. Results show DRL agents re-trained using weekly\nmarket data surpass the performance of those trained solely on the sale date.\nFurthermore, the paper demonstrates that both single-train and weekly-train DRL\nagents outperform the Black-Scholes Delta method at transaction costs of 1% and\n3%. This practical relevance suggests that practitioners can leverage readily\navailable market data to train DRL agents for effective hedging of options in\ntheir portfolios.","PeriodicalId":501128,"journal":{"name":"arXiv - QuantFin - Risk Management","volume":"24 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuantFin - Risk Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.08602","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
This paper contributes to the existing literature on hedging American options
with Deep Reinforcement Learning (DRL). The study first investigates
hyperparameter impact on hedging performance, considering learning rates,
training episodes, neural network architectures, training steps, and
transaction cost penalty functions. Results highlight the importance of
avoiding certain combinations, such as high learning rates with a high number
of training episodes or low learning rates with few training episodes, and
emphasize the significance of using moderate values for optimal outcomes.
Additionally, the paper warns against excessive training steps to prevent
instability and demonstrates the superiority of a quadratic transaction cost
penalty function over a linear version. This study then expands upon the work
of Pickard et al. (2024), who utilize a Chebyshev interpolation option pricing
method to train DRL agents with market-calibrated stochastic volatility models.
While the results of Pickard et al. (2024) showed that these DRL agents achieve
satisfactory performance on empirical asset paths, this study introduces a
novel approach in which new agents are trained at weekly intervals with newly calibrated
stochastic volatility models. Results show DRL agents re-trained using weekly
market data surpass the performance of those trained solely on the sale date.
Furthermore, the paper demonstrates that both single-train and weekly-train DRL
agents outperform the Black-Scholes Delta method at transaction costs of 1% and
3%. This practical relevance suggests that practitioners can leverage readily
available market data to train DRL agents for effective hedging of options in
their portfolios.
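
To make the abstract's comparison concrete, the sketch below illustrates two ingredients it mentions: a transaction-cost penalty (quadratic versus linear) of the kind that could enter a per-step DRL hedging reward, and the Black-Scholes Delta benchmark. This is an illustrative sketch only, not the paper's implementation: the European put delta stands in for the American put hedge ratio, and the function names, penalty scaling `lam`, cost rate `kappa`, and all parameter values are hypothetical assumptions rather than details taken from the paper.

```python
# Illustrative sketch only; not the paper's implementation.
# Shows (i) quadratic vs. linear transaction-cost penalties in a one-step
# hedging reward and (ii) the Black-Scholes Delta benchmark, with the
# European put delta used as a proxy for the American put hedge ratio.
import numpy as np
from scipy.stats import norm


def bs_put_delta(S, K, T, r, sigma):
    """European put delta, used here as a proxy hedge-ratio benchmark."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return norm.cdf(d1) - 1.0


def transaction_cost(trade_size, S, kappa):
    """Proportional transaction cost: kappa * |shares traded| * price."""
    return kappa * np.abs(trade_size) * S


def step_reward(pnl, trade_size, S, kappa, penalty="quadratic", lam=1.0):
    """One-step hedging reward: P&L minus a transaction-cost penalty.

    The abstract reports that a quadratic penalty outperforms a linear one;
    the exact functional forms and scalings used in the paper may differ.
    """
    cost = transaction_cost(trade_size, S, kappa)
    if penalty == "quadratic":
        return pnl - lam * cost**2
    return pnl - lam * cost  # linear version


if __name__ == "__main__":
    # Hypothetical parameters; kappa = 0.01 corresponds to a 1% cost rate.
    S0, K, T, r, sigma, kappa = 100.0, 100.0, 0.25, 0.05, 0.2, 0.01
    delta = bs_put_delta(S0, K, T, r, sigma)
    print(f"Benchmark BS put delta: {delta:.4f}")
    print("Quadratic-penalty reward:",
          step_reward(pnl=0.0, trade_size=delta, S=S0, kappa=kappa))
```

The sketch only indicates where such a penalty would sit in a hedging reward; the paper's reward design, cost model, and benchmark treatment of early exercise are specified in the full text.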