Localizing and Recognizing Labels for Multi-Panel Figures in Biomedical Journals

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) · Pub Date: 2017-11-01 · DOI: 10.1109/ICDAR.2017.128

Jie Zou, Sameer Kiran Antani, G. Thoma


Citations: 4

Abstract

Multi-panel figures are common in biomedical journals, and their subpanels are often of different types (e.g., x-ray, microscopy, sketch). Visual information retrieval for such figures can benefit significantly from panel label recognition, which supports indexing figures for search engines, tagging image content, and correlating panels with figure (sub)captions. The task is challenging because labels vary widely in location, size, and contrast with the background. In this work, we propose a three-stage recognition algorithm. The first stage is formulated as object detection: we extract Histogram of Oriented Gradients (HOG) features and train a linear Support Vector Machine (SVM) classifier, then detect label candidates using sliding windows at different locations and scales. We also train a deep convolutional neural network (CNN) to remove false positives. The second stage is formulated as image classification: we train a 50-class RBF SVM classifier and estimate the posterior probabilities of each candidate label. The last stage is formulated as sequence classification: we apply a beam search algorithm to the posterior probabilities estimated in the second stage, together with a set of label sequence constraints, to select an optimal label sequence. The algorithm is trained on 9,642 figures, and evaluation on the remaining 1,000 figures shows that it achieves good precision and recall.
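The third stage can be illustrated with a small sketch of beam search over per-candidate posterior probabilities. This is not the authors' implementation: the abstract does not list the actual label sequence constraints, so the two used here (no repeated label, no mixing of upper and lower case) are assumptions chosen for illustration, as are the names `NO_LABEL`, `violates`, and `beam_search` and the toy probabilities.

```python
import math

# Hypothetical marker meaning "this candidate region is a false positive".
NO_LABEL = None


def violates(sequence, label):
    """Assumed constraints for illustration: no repeated label, no mixed case."""
    if label is NO_LABEL:
        return False
    chosen = [l for l in sequence if l is not NO_LABEL]
    if label in chosen:
        return True
    return any(
        l.isupper() != label.isupper()
        for l in chosen
        if l.isalpha() and label.isalpha()
    )


def beam_search(posteriors, beam_width=3):
    """Pick a label sequence from per-candidate posteriors.

    posteriors: one dict per candidate region, mapping label -> probability.
    Returns (sequence, log_probability) for the best surviving hypothesis.
    """
    beams = [((), 0.0)]  # (partial sequence, accumulated log-probability)
    for dist in posteriors:
        expanded = []
        for seq, score in beams:
            for label, p in dist.items():
                if p <= 0.0 or violates(seq, label):
                    continue
                expanded.append((seq + (label,), score + math.log(p)))
        # Keep only the beam_width highest-scoring partial sequences.
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0] if beams else ((), float("-inf"))


# Toy posteriors for three candidate regions (made-up numbers).
toy_posteriors = [
    {"a": 0.6, "b": 0.3, NO_LABEL: 0.1},
    {"a": 0.5, "b": 0.4, NO_LABEL: 0.1},
    {"c": 0.7, "C": 0.2, NO_LABEL: 0.1},
]
best_seq, best_score = beam_search(toy_posteriors, beam_width=3)
```

In this toy input, picking the most probable label for each region independently would assign "a" to both of the first two regions. Under the assumed uniqueness constraint, the search instead keeps several partial hypotheses alive and settles on a consistent sequence.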



