Synthesising Rare Cataract Surgery Samples with Guided Diffusion Models

Yannik Frisch, Moritz Fuchs, Antoine Pierre Sanner, F. A. Ucar, Marius Frenzel, Joana Wasielica-Poslednik, A. Gericke, F. Wagner, Thomas Dratsch, A. Mukhopadhyay
Published in: Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, volume 305 1, pages 354-364
Publication date: 2023-08-03
DOI: 10.48550/arXiv.2308.02587 (https://doi.org/10.48550/arXiv.2308.02587)

Abstract

Cataract surgery is a frequently performed procedure that demands automation and advanced assistance systems. However, gathering and annotating data for training such systems is resource-intensive. The publicly available data also contains severe imbalances inherent to the surgical process. Motivated by this, we analyse cataract surgery video data for the worst-performing phases of a pre-trained downstream tool classifier. The analysis demonstrates that these imbalances degrade the classifier's performance on underrepresented cases. To address this challenge, we utilise a conditional generative model based on Denoising Diffusion Implicit Models (DDIM) and Classifier-Free Guidance (CFG). Our model can synthesise diverse, high-quality examples based on complex multi-class multi-label conditions, such as surgical phases and combinations of surgical tools. We confirm that the synthesised samples display tools that the classifier recognises. These samples are hard to differentiate from real images, even for clinical experts with more than five years of experience. Further, our synthetically extended data can alleviate the data sparsity problem for the downstream task of tool classification. The evaluations demonstrate that the model can generate valuable unseen examples, allowing the tool classifier to improve by up to 10% for rare cases. Overall, our approach can facilitate the development of automated assistance systems for cataract surgery by providing a reliable source of realistic synthetic data, which we make publicly available.
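
The abstract names DDIM sampling and Classifier-Free Guidance but gives no implementation details. The following is a minimal sketch of how the two are commonly combined, not the authors' code. It assumes the guidance convention eps = eps_uncond + w * (eps_cond - eps_uncond), a deterministic DDIM update (eta = 0), and a multi-hot vector as the multi-label condition. The function `toy_denoiser`, the label layout, and all numeric values are hypothetical placeholders for a trained noise-prediction network and its schedule.

```python
import numpy as np


def cfg_noise(eps_cond, eps_uncond, guidance_scale):
    """Classifier-free guidance: move from the unconditional noise
    prediction towards the conditional one by guidance_scale."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    alpha_bar_* are cumulative products of the noise schedule at the
    current and the previous (less noisy) timestep.
    """
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    # Re-noise that estimate to the previous timestep's noise level.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps


def toy_denoiser(x_t, cond):
    """Hypothetical stand-in for a trained network.

    cond=None means the condition was dropped (unconditional branch).
    """
    shift = 0.0 if cond is None else 0.1 * float(np.sum(cond))
    return 0.5 * x_t - shift


rng = np.random.default_rng(0)
x_t = rng.standard_normal((4, 4))          # toy "image" at timestep t
cond = np.array([0.0, 1.0, 0.0, 1.0, 1.0])  # hypothetical multi-hot phase/tool label

eps = cfg_noise(toy_denoiser(x_t, cond), toy_denoiser(x_t, None), guidance_scale=3.0)
x_prev = ddim_step(x_t, eps, alpha_bar_t=0.5, alpha_bar_prev=0.8)
```

With a guidance scale of 1 the combined prediction equals the conditional one, and with 0 it equals the unconditional one. Larger values push samples harder towards the requested condition, usually at some cost in diversity.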