Label the many with a few: Semi-automatic medical image modality discovery in a large image collection
Szilárd Vajda, D. You, Sameer Kiran Antani, G. Thoma
2014 IEEE Symposium on Computational Intelligence in Healthcare and e-health (CICARE), December 2014
DOI: 10.1109/CICARE.2014.7007850
Citations: 4
Abstract
In this paper we present a fast and effective method for labeling images in a large image collection. Image modality detection has been of research interest for querying multimodal medical documents. Accurately predicting the different image modalities from complex visual and textual features requires advanced classification schemes with supervised learning mechanisms and accurate training labels. Our proposed method, by contrast, uses a multiview approach and requires minimal expert knowledge to semi-automatically label the images. The images are first projected into different feature spaces and are then clustered in an unsupervised manner. Only the cluster representative images are labeled by an expert; the other images in each cluster “inherit” the labels from these cluster representatives. The final label assigned to each image is determined by a voting mechanism, where each vote is derived from the clustering in a different feature space. Through experiments we show that using only 0.3% of the labels was sufficient to annotate 300,000 medical images with 49.95% accuracy. Although automatic labeling is not as precise as manual labeling, it saves approximately 700 hours of expert effort and may be sufficient for next-stage classifier training. We find that, for this collection, accuracy improvements are feasible with better selection of disparate features or different filtering mechanisms.
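The abstract describes the pipeline only at a high level. Below is a minimal sketch of how such a scheme could look, assuming scikit-learn's KMeans for the per-view clustering, a simulated expert oracle, and synthetic feature matrices standing in for the paper's visual and textual descriptors. All function names, parameters, and cluster counts here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the multiview semi-automatic labeling pipeline:
# cluster each feature "view" independently, ask the expert to label only
# the cluster representatives, propagate those labels within each cluster,
# then combine the per-view labels by majority vote.
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans


def label_by_clustering(features: np.ndarray, expert_label, n_clusters: int) -> list:
    """Cluster one feature space and propagate the representatives' labels."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    labels_for_cluster = {}
    for c in range(n_clusters):
        # The representative is the cluster member closest to the centroid.
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        representative = members[np.argmin(dists)]
        labels_for_cluster[c] = expert_label(representative)  # expert queried only here
    # Every other image "inherits" its cluster representative's label.
    return [labels_for_cluster[c] for c in km.labels_]


def multiview_labels(views: list, expert_label, n_clusters: int) -> list:
    """Majority vote over the labels produced by each feature-space clustering."""
    per_view = [label_by_clustering(v, expert_label, n_clusters) for v in views]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*per_view)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_images = 1000
    # Two synthetic feature spaces standing in for visual and textual features.
    views = [rng.normal(size=(n_images, 32)), rng.normal(size=(n_images, 64))]
    true_modality = rng.integers(0, 5, size=n_images)  # stand-in ground truth
    expert = lambda idx: true_modality[idx]            # simulated expert oracle
    predicted = multiview_labels(views, expert, n_clusters=20)
    print("labeled", n_images, "images with",
          20 * len(views), "expert queries")
```

Note the property that makes the reported 0.3% figure plausible: the expert's effort scales with the number of clusters times the number of views, not with the number of images, so the same few representative labels can annotate an arbitrarily large collection.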