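A reinforcement learning application of a guided Monte Carlo Tree Search algorithm for beam orientation selection in radiation therapy.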

Machine Learning: Science and Technology. Pub Date: 2021-09-01. Epub Date: 2021-05-13. DOI: 10.1088/2632-2153/abe528

Azar Sadeghnejad-Barkousaraie, Gyanendra Bohara, Steve Jiang, Dan Nguyen


Citations: 9


A reinforcement learning application of a guided Monte Carlo Tree Search algorithm for beam orientation selection in radiation therapy.


Current beam orientation optimization algorithms for radiotherapy, such as column generation (CG), are typically heuristic or greedy in nature because of the size of the combinatorial problem, which leads to suboptimal solutions. We propose a reinforcement learning strategy using Monte Carlo Tree Search that can find a better beam orientation set in less time than CG. We utilize a reinforcement learning structure involving a supervised learning network to guide the Monte Carlo Tree Search and to explore the decision space of beam orientation selection problems. We previously trained a deep neural network (DNN) that takes in the patient anatomy, organ weights, and current beams, then approximates beam fitness values to indicate the next best beam to add. Here, we use this DNN to probabilistically guide the traversal of the branches of the Monte Carlo decision tree to add a new beam to the plan. To assess the feasibility of the algorithm, we used a test set of 13 prostate cancer patients, distinct from the 57 patients originally used to train and validate the DNN, to solve for 5-beam plans. To show the strength of the guided Monte Carlo tree search (GTS) compared to other search methods, we also provided the performances of guided search, uniform tree search and random search algorithms. On average, GTS outperformed all other methods. It found a better solution than CG in 237 seconds on average, compared to 360 seconds for CG, and outperformed all other methods in finding a solution with a lower objective function value in less than 1000 seconds. Using our guided tree search (GTS) method, we could maintain planning target volume (PTV) coverage within 1% error similar to CG, while reducing the organ-at-risk (OAR) mean dose for body, rectum, left and right femoral heads; mean dose to bladder was 1% higher with GTS than with CG.
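The search idea described in the abstract, a learned policy that probabilistically steers which branch of the decision tree to follow when adding the next beam, can be illustrated with a minimal Python sketch. This is not the authors' implementation. `toy_policy` is a hypothetical stand-in for the trained DNN, `toy_cost` is a hypothetical stand-in for the treatment-plan objective function, and the visit-count discount used to encourage exploration is an assumption made for illustration only.

```python
import math
import random
from typing import Callable, Dict, List, Optional, Tuple

Plan = Tuple[int, ...]


class Node:
    """One node of the decision tree: a partial plan (beams chosen so far)."""

    def __init__(self, plan: Plan, parent: Optional["Node"] = None) -> None:
        self.plan = plan
        self.parent = parent
        self.children: Dict[int, "Node"] = {}
        self.visits = 0
        self.best_cost = math.inf


def toy_policy(plan: Plan, n_beams: int) -> List[float]:
    """Hypothetical stand-in for the DNN: a probability for each candidate beam.

    It favors beams that are angularly far from the beams already selected.
    """
    scores = []
    for b in range(n_beams):
        if b in plan:
            scores.append(0.0)  # never pick the same beam twice
        else:
            dist = min(
                (min(abs(b - p), n_beams - abs(b - p)) for p in plan),
                default=n_beams,
            )
            scores.append(1.0 + dist)
    total = sum(scores)
    return [s / total for s in scores]


def toy_cost(plan: Plan, n_beams: int) -> float:
    """Hypothetical objective (lower is better): penalizes unevenly spaced beams."""
    beams = sorted(plan)
    ideal_gap = n_beams / len(beams)
    gaps = [
        (beams[(i + 1) % len(beams)] - beams[i]) % n_beams
        for i in range(len(beams))
    ]
    return sum((g - ideal_gap) ** 2 for g in gaps)


def guided_tree_search(
    n_beams: int,
    plan_size: int,
    policy: Callable[[Plan, int], List[float]],
    cost: Callable[[Plan, int], float],
    iterations: int,
    seed: int = 0,
) -> Tuple[Plan, float]:
    """Repeatedly walk the tree, sampling each next beam from the policy."""
    rng = random.Random(seed)
    root = Node(plan=())
    best_plan: Plan = ()
    best_cost = math.inf
    for _ in range(iterations):
        node = root
        # Descend one beam at a time until the plan is complete.
        while len(node.plan) < plan_size:
            probs = policy(node.plan, n_beams)
            # Assumption: discount branches that were already visited often.
            weights = [
                p / (1 + node.children[b].visits) if b in node.children else p
                for b, p in enumerate(probs)
            ]
            beam = rng.choices(range(n_beams), weights=weights, k=1)[0]
            if beam not in node.children:
                node.children[beam] = Node(node.plan + (beam,), parent=node)
            node = node.children[beam]
        value = cost(node.plan, n_beams)
        if value < best_cost:
            best_plan, best_cost = tuple(sorted(node.plan)), value
        # Backpropagate visit counts and the best cost seen along the path.
        walker: Optional[Node] = node
        while walker is not None:
            walker.visits += 1
            walker.best_cost = min(walker.best_cost, value)
            walker = walker.parent
    return best_plan, best_cost


if __name__ == "__main__":
    plan, value = guided_tree_search(20, 5, toy_policy, toy_cost, iterations=200)
    print("best 5-beam plan:", plan, "cost:", value)
```

Replacing `toy_policy` with a uniform distribution gives an unguided tree search of the kind the abstract uses as a comparison, which makes the sketch a convenient place to see how much the guidance changes the plans that get explored.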

