Deep Penalty Methods: A Class of Deep Learning Algorithms for Solving High Dimensional Optimal Stopping Problems
Yunfei Peng, Pengyu Wei, Wei Wei
arXiv - QuantFin - Mathematical Finance, 2024-05-18
https://doi.org/arxiv-2405.11392
We propose a deep learning algorithm for high-dimensional optimal stopping
problems. Our method is inspired by the penalty method for solving free-boundary
PDEs. Within our approach, the penalized PDE is approximated using the
Deep BSDE framework of E, Han, and Jentzen \cite{weinan2017deep}; we therefore call
our algorithm the "Deep Penalty Method (DPM)". We show that
the error of the DPM is bounded in terms of the loss function and
$O(\frac{1}{\lambda}) + O(\lambda h) + O(\sqrt{h})$, where $h$ is the time step size
and $\lambda$ is the penalty parameter. Because the $O(\frac{1}{\lambda})$ term shrinks
as $\lambda$ grows while the $O(\lambda h)$ term grows, the penalty parameter must be
chosen with care relative to $h$; the bound also suggests that the discretization
error converges at a rate of order $\frac{1}{2}$. We validate the efficacy of the DPM through numerical tests
conducted on a high-dimensional optimal stopping model arising in American
option pricing. The numerical tests confirm both the accuracy and the
computational efficiency of the proposed algorithm.
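
The trade-off in the error bound can be made concrete with a small numerical sketch. The abstract does not give the constants in the bound, so the code below assumes hypothetical constants `c1`, `c2`, `c3` (all set to 1 by default) and ignores the loss-function term entirely; the function names `error_bound` and `balanced_lambda` are illustrative and do not come from the paper. Under those assumptions, minimizing $\frac{c_1}{\lambda} + c_2 \lambda h$ over $\lambda$ gives $\lambda = \sqrt{c_1 / (c_2 h)}$, which makes every term of order $\sqrt{h}$.

```python
import math


def error_bound(lam: float, h: float,
                c1: float = 1.0, c2: float = 1.0, c3: float = 1.0) -> float:
    """Illustrative bound c1/lam + c2*lam*h + c3*sqrt(h).

    c1, c2, c3 are hypothetical constants; the loss-function term
    mentioned in the abstract is deliberately left out.
    """
    return c1 / lam + c2 * lam * h + c3 * math.sqrt(h)


def balanced_lambda(h: float, c1: float = 1.0, c2: float = 1.0) -> float:
    """Penalty parameter minimizing c1/lam + c2*lam*h for a fixed step size h.

    Setting the derivative -c1/lam**2 + c2*h to zero gives
    lam = sqrt(c1 / (c2 * h)).
    """
    return math.sqrt(c1 / (c2 * h))


if __name__ == "__main__":
    # Print the balanced penalty parameter and the resulting bound
    # for a few step sizes, along with the ratio bound / sqrt(h).
    for h in (1e-2, 1e-3, 1e-4):
        lam = balanced_lambda(h)
        bound = error_bound(lam, h)
        print(f"h={h:.0e}  lambda={lam:8.2f}  bound={bound:.4f}  "
              f"bound/sqrt(h)={bound / math.sqrt(h):.2f}")
```

With this choice the ratio of the bound to $\sqrt{h}$ stays constant as $h$ shrinks, which is consistent with the order-$\frac{1}{2}$ rate stated above. Taking $\lambda$ much larger or much smaller than $h^{-1/2}$ lets one of the first two terms dominate, which is one way to read the abstract's caution about selecting the penalty parameter.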