Enabling micro-assessments of skills in the simulated setting using temporal artificial intelligence-models.

IF 3.3 · CAS Zone 2 (Education) · JCR Q1 (Education, Scientific Disciplines)
Iben Bang Andersen, Morten Bo Søndergaard Svendsen, Anne Line Risgaard, Christian Sander Danstrup, Tobias Todsen, Martin G Tolsgaard, Mikkel Lønborg Friis
Journal: Medical Teacher, pp. 1-10
Published: 2025-09-07
Publication type: Journal Article
DOI: 10.1080/0142159X.2025.2555353 (https://doi.org/10.1080/0142159X.2025.2555353)
Citations: 0

Abstract

Enabling micro-assessments of skills in the simulated setting using temporal artificial intelligence-models.

Background: Assessing skills in simulated settings is resource-intensive and lacks validated metrics. Advances in AI offer the potential for automated competence assessment, addressing these limitations. This study aimed to develop and validate a machine learning AI model for automated evaluation during simulation-based thyroid ultrasound (US) training.

Methods: Videos from eight experts and 21 novices performing thyroid US on a simulator were analyzed. Frames were processed into sequences of 1, 10, and 50 seconds. A convolutional neural network with a pre-trained ResNet-50 base and a long short-term memory layer analyzed these sequences. The model was trained to distinguish competence levels (competent=1, not competent=0) using fourfold cross-validation, with performance metrics including precision, recall, F1 score, and accuracy. Bayesian updating and adaptive thresholding assessed performance over time.
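The evaluation described above can be illustrated with a minimal sketch. It shows how precision, recall, F1 score, and accuracy follow from binary labels (1 = competent, 0 = not competent) and one simple way to form four folds. The function names, the round-robin fold assignment, and the choice to split by participant are assumptions made for illustration; the abstract does not say how the folds were built, and the labels below are invented toy data, not study results.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, F1 and accuracy for binary labels (1 = competent)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def k_fold_indices(n_items, k=4):
    """Assign item indices to k folds round-robin (hypothetical split rule)."""
    folds = [[] for _ in range(k)]
    for i in range(n_items):
        folds[i % k].append(i)
    return folds


# Toy labels for one held-out fold (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)

# 8 experts + 21 novices = 29 participants, split into four folds.
folds = k_fold_indices(8 + 21, k=4)
```

In a fourfold setup each fold is held out once while the model is trained on the other three, and the metrics are then averaged across the four runs.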

Results: The AI model effectively differentiated expert and novice US performance. The 50-second sequences achieved the highest accuracy (70%) and F1 score (0.76). Experts spent significantly longer above the threshold (15.71 s) than novices (9.31 s; p = .030).
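To make the "duration above the threshold" measure concrete, the following sketch shows one possible reading of Bayesian updating over a stream of per-sequence model outputs. The abstract does not give the update rule or the adaptive thresholding scheme, so everything here is an assumption: each output is treated as independent evidence, the threshold is a fixed parameter rather than an adaptive one, and all names and numbers are hypothetical.

```python
def bayesian_update(belief, p_window, eps=1e-6):
    """Combine the current belief with one window's model probability.

    Assumes the window output can be used as a likelihood for 'competent'
    and (1 - output) as a likelihood for 'not competent'.
    """
    # Clamp both values so the belief can never get stuck at exactly 0 or 1.
    b = min(max(belief, eps), 1.0 - eps)
    p = min(max(p_window, eps), 1.0 - eps)
    numerator = b * p
    return numerator / (numerator + (1.0 - b) * (1.0 - p))


def posterior_trace(window_probs, prior=0.5):
    """Return the running belief after each window, starting from `prior`."""
    trace = []
    belief = prior
    for p in window_probs:
        belief = bayesian_update(belief, p)
        trace.append(belief)
    return trace


def seconds_above_threshold(trace, threshold, window_seconds):
    """Total time for which the running belief exceeds a fixed threshold."""
    return sum(window_seconds for b in trace if b > threshold)


# Hypothetical per-window outputs for one performance (not study data).
window_probs = [0.6, 0.7, 0.4, 0.8]
trace = posterior_trace(window_probs, prior=0.5)
time_above = seconds_above_threshold(trace, threshold=0.75, window_seconds=1.0)
```

With these toy inputs the belief rises, dips after the weak third window, and recovers, so only the second and fourth windows count toward the time above the threshold. A per-participant total of this kind is what a group comparison such as the reported 15.71 s versus 9.31 s would be computed from.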

Conclusions: A long short-term memory-based AI model provides near real-time, automated assessments of competence in US training. Utilizing temporal video data enables detailed micro-assessments of complex procedures, which may enhance interpretability and be applied across various procedural domains.

Source journal
Medical Teacher
Category: Medicine, Health Care
CiteScore: 7.80
Self-citation rate: 8.50%
Articles published: 396
Review time: 3-6 weeks
About the journal: Medical Teacher provides accounts of new teaching methods, guidance on structuring courses and assessing achievement, and serves as a forum for communication between medical teachers and those involved in general education. In particular, the journal recognizes the problems teachers have in keeping up-to-date with the developments in educational methods that lead to more effective teaching and learning at a time when the content of the curriculum (from medical procedures to policy changes in health care provision) is also changing. The journal features reports of innovation and research in medical education, case studies, survey articles, practical guidelines, reviews of current literature and book reviews. All articles are peer reviewed.