基于音频的手术室事件检测

IF 2.3 3区医学 Q3 ENGINEERING, BIOMEDICAL

International Journal of Computer Assisted Radiology and Surgery Pub Date : 2024-12-01 Epub Date: 2024-06-11 DOI:10.1007/s11548-024-03211-1

Jonas Fuchtmann, Thomas Riedel, Maximilian Berlet, Alissa Jell, Luca Wegener, Lars Wagner, Simone Graf, Dirk Wilhelm, Daniel Ostler-Mildner

{"title":"基于音频的手术室事件检测","authors":"Jonas Fuchtmann, Thomas Riedel, Maximilian Berlet, Alissa Jell, Luca Wegener, Lars Wagner, Simone Graf, Dirk Wilhelm, Daniel Ostler-Mildner","doi":"10.1007/s11548-024-03211-1","DOIUrl":null,"url":null,"abstract":"Purpose: Even though workflow analysis in the operating room has come a long way, current systems are still limited to research. In the quest for a robust, universal setup, hardly any attention has been given to the dimension of audio despite its numerous advantages, such as low costs, location, and sight independence, or little required processing power.Methodology: We present an approach for audio-based event detection that solely relies on two microphones capturing the sound in the operating room. Therefore, a new data set was created with over 63 h of audio recorded and annotated at the University Hospital rechts der Isar. Sound files were labeled, preprocessed, augmented, and subsequently converted to log-mel-spectrograms that served as a visual input for an event classification using pretrained convolutional neural networks.Results: Comparing multiple architectures, we were able to show that even lightweight models, such as MobileNet, can already provide promising results. Data augmentation additionally improved the classification of 11 defined classes, including inter alia different types of coagulation, operating table movements as well as an idle class. With the newly created audio data set, an overall accuracy of 90%, a precision of 91% and a F1-score of 91% were achieved, demonstrating the feasibility of an audio-based event recognition in the operating room.Conclusion: With this first proof of concept, we demonstrated that audio events can serve as a meaningful source of information that goes beyond spoken language and can easily be integrated into future workflow recognition pipelines using computational inexpensive architectures.","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2381-2387"},"PeriodicalIF":2.3000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11607025/pdf/","citationCount":"0","resultStr":"{\"title\":\"Audio-based event detection in the operating room.\",\"authors\":\"Jonas Fuchtmann, Thomas Riedel, Maximilian Berlet, Alissa Jell, Luca Wegener, Lars Wagner, Simone Graf, Dirk Wilhelm, Daniel Ostler-Mildner\",\"doi\":\"10.1007/s11548-024-03211-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Purpose: Even though workflow analysis in the operating room has come a long way, current systems are still limited to research. In the quest for a robust, universal setup, hardly any attention has been given to the dimension of audio despite its numerous advantages, such as low costs, location, and sight independence, or little required processing power.Methodology: We present an approach for audio-based event detection that solely relies on two microphones capturing the sound in the operating room. Therefore, a new data set was created with over 63 h of audio recorded and annotated at the University Hospital rechts der Isar. Sound files were labeled, preprocessed, augmented, and subsequently converted to log-mel-spectrograms that served as a visual input for an event classification using pretrained convolutional neural networks.Results: Comparing multiple architectures, we were able to show that even lightweight models, such as MobileNet, can already provide promising results. Data augmentation additionally improved the classification of 11 defined classes, including inter alia different types of coagulation, operating table movements as well as an idle class. With the newly created audio data set, an overall accuracy of 90%, a precision of 91% and a F1-score of 91% were achieved, demonstrating the feasibility of an audio-based event recognition in the operating room.Conclusion: With this first proof of concept, we demonstrated that audio events can serve as a meaningful source of information that goes beyond spoken language and can easily be integrated into future workflow recognition pipelines using computational inexpensive architectures.\",\"PeriodicalId\":51251,\"journal\":{\"name\":\"International Journal of Computer Assisted Radiology and Surgery\",\"volume\":\" \",\"pages\":\"2381-2387\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11607025/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer Assisted Radiology and Surgery\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1007/s11548-024-03211-1\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/6/11 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q3\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Assisted Radiology and Surgery","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1007/s11548-024-03211-1","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/6/11 0:00:00","PubModel":"Epub","JCR":"Q3","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}

引用次数: 0

摘要

目的：尽管手术室的工作流程分析已经取得了长足的进步，但目前的系统仍局限于研究。尽管音频具有成本低、不受地点和视线限制、所需处理能力小等诸多优势，但在寻求一种强大、通用的设置时，几乎没有人注意到音频方面的问题：我们提出了一种基于音频的事件检测方法，该方法仅依靠两个麦克风捕捉手术室内的声音。因此，我们在伊萨尔河畔大学医院创建了一个新的数据集，其中包含超过 63 小时的音频记录和注释。声音文件经过标注、预处理、增强，随后被转换为对数-麦尔谱图，作为使用预训练卷积神经网络进行事件分类的可视化输入：通过对多种架构的比较，我们发现即使是轻量级模型，如 MobileNet，也能提供令人满意的结果。数据扩增进一步改进了 11 个已定义类别的分类，其中包括不同类型的凝血、手术台移动以及空闲类别。通过新创建的音频数据集，总体准确率达到 90%，精确率达到 91%，F1 分数达到 91%，证明了在手术室中基于音频的事件识别的可行性：通过首次概念验证，我们证明了音频事件可以作为一种有意义的信息来源，它超越了口语的范畴，可以利用计算廉价架构轻松集成到未来的工作流识别管道中。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Audio-based event detection in the operating room.

查看原文本刊更多论文

Audio-based event detection in the operating room.

Purpose: Even though workflow analysis in the operating room has come a long way, current systems are still limited to research. In the quest for a robust, universal setup, hardly any attention has been given to the dimension of audio despite its numerous advantages, such as low costs, location, and sight independence, or little required processing power.

Methodology: We present an approach for audio-based event detection that solely relies on two microphones capturing the sound in the operating room. Therefore, a new data set was created with over 63 h of audio recorded and annotated at the University Hospital rechts der Isar. Sound files were labeled, preprocessed, augmented, and subsequently converted to log-mel-spectrograms that served as a visual input for an event classification using pretrained convolutional neural networks.

Results: Comparing multiple architectures, we were able to show that even lightweight models, such as MobileNet, can already provide promising results. Data augmentation additionally improved the classification of 11 defined classes, including inter alia different types of coagulation, operating table movements as well as an idle class. With the newly created audio data set, an overall accuracy of 90%, a precision of 91% and a F1-score of 91% were achieved, demonstrating the feasibility of an audio-based event recognition in the operating room.

Conclusion: With this first proof of concept, we demonstrated that audio events can serve as a meaningful source of information that goes beyond spoken language and can easily be integrated into future workflow recognition pipelines using computational inexpensive architectures.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Journal of Computer Assisted Radiology and Surgery ENGINEERING, BIOMEDICAL-RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

CiteScore

5.90

自引率

6.70%

发文量

243

审稿时长

6-12 weeks

期刊介绍： The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.