Automatic Final-Product Assessment of Virtual Reality Mastoidectomy Performance: A Validity and Reliability Study.


Otology & Neurotology. Pub Date: 2025-01-01. Epub Date: 2024-11-06. DOI: 10.1097/MAO.0000000000004346

Peter Trier Mikkelsen, Mads Sølvsten Sørensen, Pascal Senn, Andreas Frithioff, Steven Arild Wuyts Andersen

Pages: 96-103. Publication type: Journal Article.


Abstract

Objective: Assessment is key in modern surgical education to monitor progress and document sufficient skills. Virtual reality (VR) temporal bone simulators allow automated tracking of basic metrics such as time, volume removed, and collisions. However, adequate performance assessment further includes compound rating of the stepwise bony excavation, and exposure and preservation of soft tissue structures. Such complex assessment requires further development of automated assessment routines in the VR simulation environment. In this study, we present the integration of automated mastoidectomy final-product assessment with validation against manual rating.

Methods: At two international temporal bone courses, 33 otorhinolaryngology (ORL) trainees performed anatomical mastoidectomies in the Visible Ear (VR) Simulator with automatic performance assessment using a newly implemented rating routine based on the modified Welling Scale. Automated assessment was compared with manual ratings by experts using absolute agreement, intraclass correlation, and generalizability analysis to establish validity and reliability.
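As an illustration of the absolute-agreement comparison described in the Methods, the following is a minimal sketch rather than the authors' implementation. It assumes each performance is scored on dichotomous (0/1) items and that overall agreement is the mean of the per-item agreement percentages; the function names and the toy ratings are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def item_agreement(auto: Sequence[Sequence[int]],
                   manual: Sequence[Sequence[int]]) -> List[float]:
    """Percent agreement per item between automated and manual ratings.

    Both inputs are lists of performances; each performance is a list of
    item scores (assumed here to be 0 or 1).
    """
    n_performances = len(auto)
    n_items = len(auto[0])
    per_item = []
    for j in range(n_items):
        # Count performances where both rating sources gave the same score.
        matches = sum(1 for i in range(n_performances)
                      if auto[i][j] == manual[i][j])
        per_item.append(100.0 * matches / n_performances)
    return per_item


def overall_agreement(per_item: Sequence[float]) -> float:
    """Mean of the per-item agreement percentages (an assumed definition)."""
    return sum(per_item) / len(per_item)


# Hypothetical toy data: 4 performances rated on 3 items.
auto_ratings = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 1]]
manual_ratings = [[1, 1, 0], [1, 1, 0], [1, 1, 1], [0, 1, 0]]

per_item = item_agreement(auto_ratings, manual_ratings)
print(per_item)
print(round(overall_agreement(per_item), 1))
```

In the study itself the scale has 26 items; the same per-item calculation would then flag which items fall above or below a chosen agreement threshold such as 85%.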

Results: The overall average agreement between manual and automatic assessment was 83.9%, compared with an inter-rater agreement of 88.9%. A majority of items (15 out of 26) showed high agreement (>85%) between automated and manual rating. Intraclass correlation coefficients were found to be high. Generalizability analysis with D-studies found that five repetitions per participant are needed for a G coefficient >0.8, which is considered necessary for high-stakes assessments.
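The D-study reasoning in the Results can be sketched as follows. This is a simplified illustration, not the authors' analysis: it assumes a single-facet design (participants crossed with repetitions), in which the G coefficient for n repetitions is the participant variance divided by the participant variance plus the residual variance over n. The variance components below are hypothetical values chosen only for illustration; the abstract does not report the study's components.

```python
from typing import Optional


def g_coefficient(var_person: float, var_residual: float,
                  n_repetitions: int) -> float:
    """Projected G coefficient when averaging over n repetitions.

    Assumes a single-facet (person x repetition) design, so the error
    variance of the mean score shrinks in proportion to 1/n.
    """
    return var_person / (var_person + var_residual / n_repetitions)


def min_repetitions(var_person: float, var_residual: float,
                    target: float = 0.8, max_n: int = 100) -> Optional[int]:
    """Smallest n whose projected G coefficient exceeds the target."""
    for n in range(1, max_n + 1):
        if g_coefficient(var_person, var_residual, n) > target:
            return n
    return None  # Target not reachable within max_n repetitions.


# Hypothetical variance components (NOT from the study).
VAR_PERSON = 0.50
VAR_RESIDUAL = 0.55

for n in range(1, 7):
    print(n, round(g_coefficient(VAR_PERSON, VAR_RESIDUAL, n), 3))
print(min_repetitions(VAR_PERSON, VAR_RESIDUAL))
```

With these illustrative components the projected coefficient first exceeds 0.8 at five repetitions, which shows the kind of calculation behind a D-study recommendation. A full generalizability analysis would typically include further facets, such as raters or items.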

Conclusion: We have demonstrated the feasibility, validity, and reliability of an automatic assessment system integrated into a VR temporal bone simulator. This can prove to be an important tool for future self-directed training with skills certification.

