Reinforcement learning for optimizing responses in care processes

IF 2.7 · CAS Tier 3 (Computer Science) · JCR Q3 (Computer Science, Artificial Intelligence)
Olusanmi A. Hundogan, Bart J. Verhoef, Patrick Theeven, Hajo A. Reijers, Xixi Lu
DOI: 10.1016/j.datak.2025.102412
Journal: Data & Knowledge Engineering, Volume 157, Article 102412
Published: 2025-02-03 (Journal Article)
Available at: https://www.sciencedirect.com/science/article/pii/S0169023X25000072
Citations: 0

Abstract

Prescriptive process monitoring aims to derive recommendations for optimizing complex processes. While previous studies have successfully used reinforcement learning techniques to derive actionable policies in business processes, care processes present unique challenges due to their dynamic and multifaceted nature. For example, at any stage of a care process, a multitude of actions is possible. In this study, we follow the Reinforcement Learning (RL) approach and present a general approach that uses event data to build and train Markov decision processes. We propose three algorithms, including one that takes the elapsed time into account when transforming an event log into a semi-Markov decision process. We evaluated the RL approach using an aggression incident data set. Specifically, the goal is to optimize staff member actions when clients are displaying different types of aggressive behavior. Q-learning and SARSA are used to find optimal policies. Our results showed that the derived policies align closely with current practices while offering alternative options in specific situations. By employing RL in the context of care processes, we contribute to the ongoing efforts to enhance decision-making and efficiency in dynamic and complex environments.
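The abstract describes a two-step pipeline: estimate a Markov decision process from event-log transitions, then apply tabular Q-learning (or SARSA) to extract a recommended policy. The sketch below illustrates that general idea only; the states, actions, and rewards are invented for illustration, and the paper's actual three algorithms (including the semi-Markov variant that accounts for elapsed time) are not reproduced here.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical event log: each trace is a list of
# (state, action, reward, next_state) tuples. These labels are
# illustrative, not taken from the paper's aggression-incident data set.
event_log = [
    [("verbal_aggression", "talk_down", 1.0, "calm"),
     ("calm", "no_action", 0.5, "end")],
    [("verbal_aggression", "restrain", -1.0, "physical_aggression"),
     ("physical_aggression", "call_backup", 0.0, "calm"),
     ("calm", "no_action", 0.5, "end")],
]

# Step 1: estimate an empirical MDP from the log (transition counts
# and observed rewards per state-action pair).
counts = defaultdict(lambda: defaultdict(int))   # (s, a) -> {s': count}
rewards = {}                                     # (s, a) -> observed reward
for trace in event_log:
    for s, a, r, s2 in trace:
        counts[(s, a)][s2] += 1
        rewards[(s, a)] = r

actions_in = defaultdict(set)
for (s, a) in counts:
    actions_in[s].add(a)

def sample_step(s, a):
    """Sample a next state from the empirical transition distribution."""
    nxt = counts[(s, a)]
    states, weights = zip(*nxt.items())
    return rewards[(s, a)], random.choices(states, weights=weights)[0]

# Step 2: tabular Q-learning with an epsilon-greedy behavior policy.
alpha, gamma, eps = 0.1, 0.9, 0.2
Q = defaultdict(float)
for _ in range(5000):
    s = "verbal_aggression"
    while s != "end":
        acts = sorted(actions_in[s])
        a = random.choice(acts) if random.random() < eps \
            else max(acts, key=lambda x: Q[(s, x)])
        r, s2 = sample_step(s, a)
        best_next = max((Q[(s2, x)] for x in actions_in[s2]), default=0.0)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# Greedy policy: the recommended action per observed state.
policy = {s: max(acts, key=lambda a: Q[(s, a)])
          for s, acts in actions_in.items() if acts}
print(policy)
```

Replacing the `best_next` maximum with the Q-value of the action actually chosen in the next state would turn this into SARSA, the on-policy counterpart also used in the paper.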
Source journal: Data & Knowledge Engineering (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles per year: 66
Review time: 6 months
Journal description: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.