An Unsupervised Approach for Automatic Discovery of Metadata in Document Images

2016 Fifteenth Mexican International Conference on Artificial Intelligence (MICAI) Pub Date : 2016-10-01 DOI:10.1109/MICAI-2016.2016.00009

Carlos Morales-Solares, Gerardo E Sierra, B. Escalante-Ramírez

Citations: 1

Abstract

The visual information contained in documents provides a rich set of features that can be exploited to improve document understanding. The typography, design, and lexical properties of text are the clues that help us distinguish, at a glance, one kind of data from another. In this paper, we present a methodology to identify, extract, and automatically classify the metadata on document covers. One problem associated with metadata discovery is processing the original document format. We propose combining two methods: maximally stable extremal regions (MSER) for detecting text in cover images with complex backgrounds, and conditional random fields (CRF) for logically labeling the elements of the document. We describe the selected set of visual and linguistic features used to train our model. As a proof of concept, we incorporated the methods into a desktop application and ran several examples. Preliminary results show improved text recognition performance compared with traditional methods of metadata extraction for document images. In particular, one problem we seek to solve is the ambiguity between the book title and the author.
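The labeling stage named in the abstract can be illustrated with a small sketch. What follows is not the authors' model: it is a minimal linear-chain CRF decoder (Viterbi) in which the feature names (`large_font`, `near_top`, `looks_like_name`), the label set, and every weight are hypothetical values chosen by hand for illustration. A real CRF would learn its weights from training data, and the text lines would come from a detector such as MSER followed by OCR. The sketch only shows how visual and linguistic cues, combined with label-to-label transitions, could separate a title line from an author line.

```python
from typing import Dict, List, Tuple

LABELS: List[str] = ["TITLE", "AUTHOR", "OTHER"]

# Hypothetical hand-set emission weights (feature -> weight per label).
# A trained CRF would estimate these from annotated covers.
EMISSION: Dict[str, Dict[str, float]] = {
    "TITLE": {"large_font": 2.0, "near_top": 0.5, "looks_like_name": -0.5},
    "AUTHOR": {"large_font": -0.5, "near_top": 0.0, "looks_like_name": 2.0},
    "OTHER": {"large_font": -1.0, "near_top": -0.5, "looks_like_name": -1.0},
}

# Hypothetical transition weights; missing pairs default to 0.0.
TRANSITION: Dict[Tuple[str, str], float] = {
    ("TITLE", "TITLE"): 0.5,
    ("TITLE", "AUTHOR"): 1.0,
    ("AUTHOR", "OTHER"): 0.5,
    ("OTHER", "OTHER"): 0.5,
}


def emission_score(label: str, feats: Dict[str, float]) -> float:
    """Weighted sum of one line's features under the given label."""
    return sum(EMISSION[label].get(name, 0.0) * value
               for name, value in feats.items())


def viterbi(lines: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence for the text lines."""
    if not lines:
        return []
    # best[t][label] = score of the best path ending in `label` at line t
    best = [{lab: emission_score(lab, lines[0]) for lab in LABELS}]
    back: List[Dict[str, str]] = []
    for feats in lines[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p]
                       + TRANSITION.get((p, lab), 0.0))
            scores[lab] = (best[-1][prev]
                           + TRANSITION.get((prev, lab), 0.0)
                           + emission_score(lab, feats))
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Three text lines from an imagined cover, top to bottom.
cover = [
    {"large_font": 1.0, "near_top": 1.0, "looks_like_name": 0.0},
    {"large_font": 0.0, "near_top": 0.0, "looks_like_name": 1.0},
    {"large_font": 0.0, "near_top": 0.0, "looks_like_name": 0.0},
]
labels = viterbi(cover)
print(labels)  # -> ['TITLE', 'AUTHOR', 'OTHER']
```

In this toy setup the second line is labeled AUTHOR both because it looks like a personal name and because the TITLE-to-AUTHOR transition is rewarded, which is the kind of contextual evidence a sequence model can bring to the title/author ambiguity.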
