Pseudo-model-free hedging for variable annuities via deep reinforcement learning

Impact factor: 1 · JCR quartile: Q3 · Category: Business, Finance

Annals of Actuarial Science Pub Date : 2021-07-07 DOI:10.1017/s1748499523000027

W. F. Chong, Haoen Cui, Yuxuan Li

Citations: 2

Abstract

This paper proposes a two-phase deep reinforcement learning approach for hedging variable annuity contracts with both GMMB and GMDB riders, which can address model miscalibration in Black-Scholes financial and constant force of mortality actuarial market environments. In the training phase, an infant reinforcement learning agent interacts with a pre-designed training environment, collects sequential anchor-hedging reward signals, and gradually learns how to hedge the contracts. As expected, after a sufficient number of training steps, the trained reinforcement learning agent hedges in the training environment as well as the correct Delta and outperforms misspecified Deltas. In the online learning phase, the trained reinforcement learning agent interacts with the market environment in real time, collects single terminal reward signals, and self-revises its hedging strategy. The hedging performance of the further trained reinforcement learning agent is demonstrated on a rolling basis via an illustrative example, which reveals the hedging strategy's capacity for self-revision through online learning.
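As a rough illustration of the Delta benchmark mentioned in the abstract, the sketch below computes the Delta of a deliberately simplified GMMB liability. It is not the paper's contract specification or its reinforcement learning method. The assumptions, none of which come from the source, are: the rider pays max(G - F_T, 0) at maturity only if the policyholder survives, there are no fees and no GMDB rider, the fund follows geometric Brownian motion, and mortality has a constant force `mu`. The function name `gmmb_delta` and all parameter values are hypothetical.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gmmb_delta(fund: float, guarantee: float, r: float,
               sigma: float, mu: float, tau: float) -> float:
    """Delta of a simplified GMMB liability (assumption: no fees, no GMDB).

    The liability is treated as a Black-Scholes put on the fund value,
    scaled by the survival probability exp(-mu * tau) under a constant
    force of mortality mu, with tau the time to maturity in years.
    """
    d1 = (math.log(fund / guarantee) + (r + 0.5 * sigma ** 2) * tau) / (
        sigma * math.sqrt(tau)
    )
    survival = math.exp(-mu * tau)
    return -survival * norm_cdf(-d1)


# Hypothetical parameters: the "correct" Delta uses the true volatility,
# the "misspecified" Delta uses a miscalibrated one.
correct_delta = gmmb_delta(100.0, 100.0, r=0.02, sigma=0.2, mu=0.01, tau=10.0)
misspecified_delta = gmmb_delta(100.0, 100.0, r=0.02, sigma=0.3, mu=0.01, tau=10.0)
print(correct_delta, misspecified_delta)
```

Under these assumptions, a hedger who miscalibrates the volatility holds a different position in the fund than the correctly calibrated hedger, which is the kind of gap the abstract says the trained agent is compared against.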

Source journal

Annals of Actuarial Science ECONOMICS-

CiteScore

3.10

Self-citation rate

5.90%

Articles published