Uncertainty-aware multiple-instance learning for reliable classification: Application to optical coherence tomography

IF 10.7 · Zone 1 (Medicine) · JCR Q1 · Computer Science, Artificial Intelligence

Medical Image Analysis · Pub Date: 2024-06-27 · DOI: 10.1016/j.media.2024.103259

Coen de Vente, Bram van Ginneken, Carel B. Hoyng, Caroline C.W. Klaver, Clara I. Sánchez

Medical Image Analysis, volume 97, Article 103259 (journal article). Publisher page: https://www.sciencedirect.com/science/article/pii/S1361841524001841

Citations: 0

Abstract



Uncertainty-aware multiple-instance learning for reliable classification: Application to optical coherence tomography

Deep learning classification models for medical image analysis often perform well on data from scanners that were used to acquire the training data. However, when these models are applied to data from different vendors, their performance tends to drop substantially. Artifacts that only occur within scans from specific scanners are major causes of this poor generalizability. We aimed to enhance the reliability of deep learning classification models using a novel method called Uncertainty-Based Instance eXclusion (UBIX). UBIX is an inference-time module that can be employed in multiple-instance learning (MIL) settings. MIL is a paradigm in which instances (generally crops or slices) of a bag (generally an image) contribute towards a bag-level output. Instead of assuming equal contribution of all instances to the bag-level output, UBIX detects instances corrupted due to local artifacts on-the-fly using uncertainty estimation, reducing or fully ignoring their contributions before MIL pooling. In our experiments, instances are 2D slices and bags are volumetric images, but alternative definitions are also possible. Although UBIX is generally applicable to diverse classification tasks, we focused on the staging of age-related macular degeneration in optical coherence tomography. Our models were trained on data from a single scanner and tested on external datasets from different vendors, which included vendor-specific artifacts. UBIX showed reliable behavior, with a slight decrease in performance (a decrease of the quadratic weighted kappa ($\kappa_w$) from 0.861 to 0.708), when applied to images from different vendors containing artifacts; while a state-of-the-art 3D neural network without UBIX suffered from a significant detriment of performance ($\kappa_w$ from 0.852 to 0.084) on the same test set. We showed that instances with unseen artifacts can be identified with OOD detection. UBIX can reduce their contribution to the bag-level predictions, improving reliability without retraining on new data. This potentially increases the applicability of artificial intelligence models to data from other scanners than the ones for which they were developed. The source code for UBIX, including trained model weights, is publicly available through https://github.com/qurAI-amsterdam/ubix-for-reliable-classification.
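The idea of down-weighting or excluding uncertain instances before MIL pooling can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repository linked above). The function name `uncertainty_weighted_pooling`, the `threshold` and `temperature` parameters, the sigmoid soft-weighting rule, and the choice of weighted mean pooling are all assumptions made here for illustration. The sketch assumes each instance (for example, a 2D slice) already has class probabilities and a scalar uncertainty score.

```python
import numpy as np


def uncertainty_weighted_pooling(instance_probs, instance_uncertainty,
                                 threshold, mode="hard", temperature=0.1):
    """Pool instance-level probabilities into one bag-level prediction.

    Hypothetical helper, not the published UBIX code.
    instance_probs: array of shape (n_instances, n_classes)
    instance_uncertainty: array of shape (n_instances,), higher = less reliable
    """
    probs = np.asarray(instance_probs, dtype=float)
    unc = np.asarray(instance_uncertainty, dtype=float)

    if mode == "hard":
        # Fully ignore instances whose uncertainty exceeds the threshold.
        weights = (unc <= threshold).astype(float)
    elif mode == "soft":
        # Smoothly reduce the contribution of uncertain instances (assumed rule).
        weights = 1.0 / (1.0 + np.exp((unc - threshold) / temperature))
    else:
        raise ValueError("mode must be 'hard' or 'soft'")

    # If every instance was excluded, fall back to equal contributions.
    if weights.sum() <= 1e-12:
        weights = np.ones_like(unc)

    weights = weights / weights.sum()
    # Weighted mean over instances gives the bag-level probabilities.
    return weights @ probs


# Toy bag: four slices, three classes; the third slice is treated as
# artifact-corrupted and carries a high uncertainty score.
slice_probs = np.array([
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.00, 0.00, 1.00],
    [0.90, 0.05, 0.05],
])
slice_uncertainty = np.array([0.05, 0.08, 0.90, 0.07])

print("equal weights:", slice_probs.mean(axis=0))
print("hard exclusion:", uncertainty_weighted_pooling(
    slice_probs, slice_uncertainty, threshold=0.5, mode="hard"))
print("soft weighting:", uncertainty_weighted_pooling(
    slice_probs, slice_uncertainty, threshold=0.5, mode="soft"))
```

In this toy bag the uncertain slice pulls the equally weighted average toward the third class, while both weighted variants largely remove its influence. Because the weighting happens only at inference time, a sketch like this needs no retraining, which matches the abstract's description of UBIX as an inference-time module.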


Source journal

Medical Image Analysis (Engineering & Technology - Engineering: Biomedical)

CiteScore: 22.10

Self-citation rate: 6.40%

Articles published: 309

Review time: 6.6 months

About the journal: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.