Using continuous reinforcement learning to obtain optimal dose of the drug in patients with melanoma during initial stage.

IF 1.6 | Zone 4, Medicine | Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computer Methods in Biomechanics and Biomedical Engineering Pub Date : 2025-07-23 DOI:10.1080/10255842.2025.2519418

Elnaz Kalhor, Amin Noori, Sara Saboori Rad

Citations: 0

Abstract

This paper addresses the rapid treatment of melanoma, one of the most malignant types of cancer. Without prompt action, the disease can be fatal. Determining the optimal drug dose is a serious challenge for medical experts, and intelligent methods can help them reliably select a suitable dose for rapid treatment. Reinforcement learning (RL) is one of the most promising candidates. However, conventional RL operates on discrete states and actions, which limits its accuracy and speed and may increase control effort. These drawbacks motivate the use of continuous RL, which combines neural networks (NNs) with the RL approach. Working in a continuous state space improves the accuracy and optimality of the dose used to control and eliminate the cancer cell population, while the complexity of the approach remains low. According to physicians, treatment of melanoma in its initial stage takes two months, after which the cancer cells are completely eliminated from the patient's body. Accordingly, a mathematical model of a patient with initial-stage melanoma is employed. The proposed method is compared with the Eligibility Traces algorithm, the Q-learning algorithm, and a constant-dose injection method. Simulation results indicate that with the combination of RL and NNs the cancer cells vanish completely after 50 days, and the other parameters of the model remain within their normal ranges. In contrast, with the Eligibility Traces and Q-learning algorithms, cancer cells are still present in the patient's body after 50 days. The injected dose under the proposed hybrid method is also significantly lower than under the other methods, which reduces the side effects of the drug. Finally, the effectiveness of the proposed approach is evaluated on 5 melanoma patients in the presence of uncertainty and noise. The results confirm that the approach can control the cancer cell population and bring it to a desired level.
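To make the discrete baseline concrete, the following is a minimal sketch of tabular Q-learning with eligibility traces (Watkins's Q(λ)) choosing a daily dose. The abstract does not give the paper's patient model, reward, or hyperparameters, so everything here is an assumption for illustration only: the one-variable toy model in `step`, the dose grid `DOSES`, the bin count `N_BINS`, and all numeric constants are hypothetical and are not taken from the paper.

```python
import random

DOSES = [0.0, 0.25, 0.5, 0.75, 1.0]  # discretized dose levels (assumed)
N_BINS = 10                           # discretized tumor-burden states (assumed)


def step(x, u, growth=0.1, kill=0.3):
    """One day of a hypothetical toy model: logistic growth minus dose-driven kill."""
    x_next = x + growth * x * (1.0 - x) - kill * u * x
    x_next = min(max(x_next, 0.0), 1.0)
    reward = -(x_next + 0.1 * u)  # penalize remaining tumor burden and drug use
    return x_next, reward


def bin_of(x):
    """Map a burden in [0, 1] to one of N_BINS table rows."""
    return min(int(x * N_BINS), N_BINS - 1)


def train_q_lambda(episodes=200, horizon=50, alpha=0.1, gamma=0.95,
                   lam=0.8, eps=0.1, seed=0):
    """Tabular Watkins Q(lambda) with accumulating eligibility traces."""
    rng = random.Random(seed)
    n_a = len(DOSES)
    q = [[0.0] * n_a for _ in range(N_BINS)]

    def greedy(s):
        return max(range(n_a), key=lambda a: q[s][a])

    def eps_greedy(s):
        return rng.randrange(n_a) if rng.random() < eps else greedy(s)

    for _ in range(episodes):
        e = [[0.0] * n_a for _ in range(N_BINS)]  # traces reset each episode
        x = 0.5
        s = bin_of(x)
        a = eps_greedy(s)
        for _ in range(horizon):
            x, r = step(x, DOSES[a])
            s2 = bin_of(x)
            a2 = eps_greedy(s2)
            a_star = greedy(s2)
            if q[s2][a2] == q[s2][a_star]:
                a_star = a2  # ties count as greedy
            delta = r + gamma * q[s2][a_star] - q[s][a]
            e[s][a] += 1.0
            # Keep decaying traces only while the next action is greedy.
            decay = gamma * lam if a2 == a_star else 0.0
            for i in range(N_BINS):
                for j in range(n_a):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= decay
            s, a = s2, a2
    return q
```

After training, a greedy dose per state bin can be read off with `DOSES[max(range(len(DOSES)), key=lambda a: q[s][a])]`. In this sketch the policy can only pick one of five dose levels and cannot distinguish burdens that fall in the same bin, which is the kind of resolution limit the abstract attributes to discrete states and actions. A continuous variant would replace the table with a function approximator such as a neural network.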

Source journal

Computer Methods in Biomechanics and Biomedical Engineering (Engineering & Technology - Engineering: Biomedical)

CiteScore: 4.10

Self-citation rate: 6.20%

Articles published: 179

Review time: 4-8 weeks

Journal description: The primary aims of Computer Methods in Biomechanics and Biomedical Engineering are to provide a means of communicating the advances being made in the areas of biomechanics and biomedical engineering and to stimulate interest in the continually emerging computer-based technologies which are being applied in these multidisciplinary subjects. Computer Methods in Biomechanics and Biomedical Engineering will also provide a focus for the importance of integrating the disciplines of engineering with medical technology and clinical expertise. Such integration will have a major impact on health care in the future.