Enhancing Medical Training Through Learning From Mistakes by Interacting With an Ill-Trained Reinforcement Learning Agent

IF 4.9 | Zone 3, Education | JCR Q2, Computer Science, Interdisciplinary Applications

IEEE Transactions on Learning Technologies | Pub Date: 2024-03-04 | DOI: 10.1109/TLT.2024.3372508

Yasar C. Kakdas, Sinan Kockara, Tansel Halic, Doga Demirel

Volume 17, pp. 1248-1260 | Publication type: Journal Article | Open access: no | Source: https://ieeexplore.ieee.org/document/10458319/

Citations: 0

Abstract


Enhancing Medical Training Through Learning From Mistakes by Interacting With an Ill-Trained Reinforcement Learning Agent

This article presents a 3-D medical simulation that employs reinforcement learning (RL) and interactive RL (IRL) to teach and assess the procedure of donning and doffing personal protective equipment (PPE). The simulation is motivated by the need for effective, safe, and remote training techniques in medicine, particularly in light of the COVID-19 pandemic.

The simulation has two modes: a tutorial mode and an assessment mode. In the tutorial mode, a computer-based, ill-trained RL agent learns the correct sequence for donning PPE by trial and error. This exposes students to many outlier cases they might not encounter in an in-class educational model. In the assessment mode, an IRL-based method evaluates how effectively the participant corrects the mistakes made by the RL agent. Each time the RL agent interacts with the environment and performs an action, the participant provides positive or negative feedback on that action. After the assessment, participants receive a score based on the accuracy of their feedback and the time the RL agent took to learn the correct sequence.

An experiment was conducted with two groups of ten participants each. The first group received RL-assisted training for donning PPE, followed by an IRL-based assessment. The second group watched a video of the RL agent demonstrating only the correct donning order, without outlier cases, to replicate traditional training, and then took the same assessment as the first group. Results showed that RL-assisted training with many outlier cases was more effective than traditional training with only regular cases. Moreover, combining RL with IRL significantly enhanced the participants' performance. Notably, 90% of the participants who received RL-assisted training finished the assessment with perfect scores within three iterations. In contrast, only 10% of those who did not receive RL-assisted training finished the assessment with a perfect score, highlighting the substantial impact of RL and IRL integration on participants' overall achievement.
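The feedback loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the donning order, the function names (`oracle_feedback`, `greedy_sequence`, `train`), the per-step value table, and the learning parameters are all assumptions chosen for brevity, and the paper's scoring formula is not modeled. A scripted oracle stands in for the human participant and returns +1 or -1 after each action the agent takes.

```python
import random

# Hypothetical donning order, used only for illustration (not taken from the paper).
CORRECT_ORDER = ["gown", "mask", "goggles", "gloves"]


def oracle_feedback(step, item):
    """Stand-in for the participant: +1 if `item` is the right item at `step`, else -1."""
    return 1 if CORRECT_ORDER[step] == item else -1


def greedy_sequence(q):
    """Sequence the agent would pick if it always chose its best-rated item."""
    return [max(CORRECT_ORDER, key=lambda item: q[(step, item)])
            for step in range(len(CORRECT_ORDER))]


def train(feedback_fn, epsilon=0.2, alpha=0.5, max_episodes=200, seed=0):
    """Run feedback-driven attempts until the greedy policy matches CORRECT_ORDER.

    Returns (episodes_used, q_table). Each (step, item) value is nudged toward
    the feedback it receives; this simple update is an assumption of the sketch.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(len(CORRECT_ORDER)) for a in CORRECT_ORDER}
    for episode in range(1, max_episodes + 1):
        for step in range(len(CORRECT_ORDER)):
            if rng.random() < epsilon:
                item = rng.choice(CORRECT_ORDER)  # explore a random item
            else:
                item = max(CORRECT_ORDER, key=lambda a: q[(step, a)])  # exploit
            reward = feedback_fn(step, item)
            q[(step, item)] += alpha * (reward - q[(step, item)])
            if reward < 0:
                break  # negative feedback ends the attempt; the agent starts over
        if greedy_sequence(q) == CORRECT_ORDER:
            return episode, q
    return max_episodes, q


if __name__ == "__main__":
    episodes, q_table = train(oracle_feedback)
    print("episodes until the correct order was learned:", episodes)
    print("learned order:", greedy_sequence(q_table))
```

In this toy setting the number of attempts the agent needs depends on how consistent the feedback is, which is the intuition behind scoring participants on both feedback accuracy and the agent's learning time.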


Source journal

IEEE Transactions on Learning Technologies (Computer Science, Interdisciplinary Applications)

CiteScore: 7.50

Self-citation rate: 5.40%

Articles published: (not listed)

Review time: >12 weeks

Journal description: The IEEE Transactions on Learning Technologies covers all advances in learning technologies and their applications, including but not limited to the following topics: innovative online learning systems; intelligent tutors; educational games; simulation systems for education and training; collaborative learning tools; learning with mobile devices; wearable devices and interfaces for learning; personalized and adaptive learning systems; tools for formative and summative assessment; tools for learning analytics and educational data mining; ontologies for learning systems; standards and web services that support learning; authoring tools for learning materials; computer support for peer tutoring; learning via computer-mediated inquiry, field, and lab work; social learning techniques; social networks and infrastructures for learning and knowledge sharing; and creation and management of learning objects.