Multi-View Representation Learning for Multi-Instance Learning with Applications to Medical Image Classification

Lu Zhao, Liming Yuan, Zhenliang Li, Xianbin Wen
{"title":"Multi-View Representation Learning for Multi-Instance Learning with Applications to Medical Image Classification","authors":"Lu Zhao, Liming Yuan, Zhenliang Li, Xianbin Wen","doi":"10.1109/BIBM55620.2022.9995079","DOIUrl":null,"url":null,"abstract":"Multi-Instance Learning (MIL) is a weakly supervised learning paradigm, in which every training example is a labeled bag of unlabeled instances. In typical MIL applications, instances are often used for describing the features of regions/parts in a whole object, e.g., regional patches/lesions in an eye-fundus image. However, for a (semantically) complex part the standard MIL formulation puts a heavy burden on the representation ability of the corresponding instance. To alleviate this pressure, we still adopt a bag-of-instances as an example in this paper, but extract from each instance a set of representations using $1 \\times1$ convolutions. The advantages of this tactic are two-fold: i) This set of representations can be regarded as multi-view representations for an instance; ii) Compared to building multi-view representations directly from scratch, extracting them automatically using $1 \\times1$ convolutions is more economical, and may be more effective since $1 \\times1$ convolutions can be embedded into the whole network. Furthermore, we apply two consecutive multi-instance pooling operations on the reconstituted bag that has actually become a bag of sets of multi-view representations. We have conducted extensive experiments on several canonical MIL data sets from different application domains. The experimental results show that the proposed framework outperforms the standard MIL formulation in terms of classification performance and has good interpretability.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBM55620.2022.9995079","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Multi-Instance Learning (MIL) is a weakly supervised learning paradigm in which every training example is a labeled bag of unlabeled instances. In typical MIL applications, instances describe the features of regions or parts of a whole object, e.g., regional patches or lesions in an eye-fundus image. For a (semantically) complex part, however, the standard MIL formulation places a heavy burden on the representation ability of the corresponding single instance. To alleviate this pressure, we still represent each example as a bag of instances in this paper, but extract from each instance a set of representations using $1 \times 1$ convolutions. The advantages of this tactic are two-fold: i) the set of representations can be regarded as multi-view representations of an instance; ii) compared to building multi-view representations directly from scratch, extracting them automatically with $1 \times 1$ convolutions is more economical, and may be more effective since $1 \times 1$ convolutions can be embedded into the whole network. Furthermore, we apply two consecutive multi-instance pooling operations to the reconstituted bag, which has effectively become a bag of sets of multi-view representations. We have conducted extensive experiments on several canonical MIL data sets from different application domains. The experimental results show that the proposed framework outperforms the standard MIL formulation in terms of classification performance and offers good interpretability.
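Since the abstract describes the architecture only at a high level, the following is a minimal PyTorch sketch of the idea, not the authors' released code: each instance embedding is expanded into several views by a 1x1 convolution, the views are pooled within each instance, and the resulting instance representations are pooled across the bag. The dimensions (`in_dim`, `num_views`, `view_dim`) and the concrete pooling choices (max over views, attention over instances) are assumptions made for illustration.

```python
# A minimal sketch of the multi-view MIL idea from the abstract (assumptions:
# dimensions and pooling operators; the paper does not specify these here).
import torch
import torch.nn as nn


class MultiViewMIL(nn.Module):
    def __init__(self, in_dim: int = 512, num_views: int = 4,
                 view_dim: int = 128, num_classes: int = 2):
        super().__init__()
        self.num_views, self.view_dim = num_views, view_dim
        # 1x1 convolution along the instance axis: each instance embedding
        # (treated as channels) is mapped to num_views * view_dim features,
        # i.e., a set of multi-view representations per instance.
        self.view_extractor = nn.Conv1d(in_dim, num_views * view_dim,
                                        kernel_size=1)
        # Attention scores (tanh MLP) for the second, instance-level pooling,
        # in the spirit of attention-based MIL pooling (an assumed choice).
        self.attn = nn.Sequential(
            nn.Linear(view_dim, 64), nn.Tanh(), nn.Linear(64, 1)
        )
        self.classifier = nn.Linear(view_dim, num_classes)

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        # bag: (N, in_dim) -- one bag of N instance embeddings.
        n = bag.size(0)
        x = bag.t().unsqueeze(0)                    # (1, in_dim, N)
        x = self.view_extractor(x)                  # (1, V*d, N)
        views = x.view(self.num_views, self.view_dim, n).permute(2, 0, 1)
        # views: (N, V, d) -- the "bag of sets of multi-view representations".
        inst = views.max(dim=1).values              # pooling 1: over views
        a = torch.softmax(self.attn(inst), dim=0)   # (N, 1) attention weights
        z = (a * inst).sum(dim=0)                   # pooling 2: over instances
        return self.classifier(z)                   # bag-level logits
```

For a bag of 20 instances with 512-dimensional embeddings, `MultiViewMIL()(torch.randn(20, 512))` returns the bag-level logits; training against bag labels proceeds as in standard MIL.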