Tag Information Recognition Approaches and Algorithms for Cross-Border Products Checking

Dunsheng Chen, Yinsheng Li, X. Liang
Published in: Proceedings of the 4th International Conference on Crowd Science and Engineering
Publication date: 2019-10-18
DOI: 10.1145/3371238.3371248 (https://doi.org/10.1145/3371238.3371248)
Citation count: 0

Abstract

Images with fixed layouts, such as ID cards, driving licenses, and invoices, can be recognized using prior knowledge [1]-[7]. However, for images without a fixed layout, such as product labels at ports, it is very difficult to extract structured data from tag images, because the formats and contents of tags vary widely across countries and products [8]. The process is complex and the error rate is high. This paper exploits the characteristics of cross-border product labels, whose overall format is complex but whose local structure is simple (top-to-bottom and left-to-right), and proposes a method for identifying and structuring port commodity label information. The method first establishes a template library of keyword and data-unit information for commodity labels according to the port commodity classification. It then separates the keywords from the data in the multi-line text, with accurate location information, that the OCR engine recognizes. Finally, the keywords and data are structured according to the local layout pattern between them, yielding structured cross-border product information.
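The pipeline described in the abstract (template keywords, keyword/data separation over located OCR text, then pairing by local layout) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `TextBox` fields, the `TEMPLATE_KEYWORDS` set, the matching rule, and the "right first, then below" pairing heuristic are all assumptions made for illustration, and the OCR output is assumed to arrive already as text boxes with pixel coordinates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextBox:
    """One OCR text fragment with its location (hypothetical format)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


# Hypothetical template library for one commodity class.
TEMPLATE_KEYWORDS = {"net weight", "origin"}


def normalize(text: str) -> str:
    """Lower-case the text and drop a trailing colon for matching."""
    return text.strip().rstrip(":").strip().lower()


def is_keyword(box: TextBox, keywords: set) -> bool:
    return normalize(box.text) in keywords


def find_value(key: TextBox, candidates: list) -> Optional[TextBox]:
    """Pick the data box for a keyword using the simple local layout:
    first the nearest box to the right on the same line (left-to-right),
    otherwise the nearest horizontally overlapping box below (top-to-bottom)."""
    key_mid = key.y + key.h / 2
    right = [
        b for b in candidates
        if b.x >= key.x + key.w and abs((b.y + b.h / 2) - key_mid) <= key.h / 2
    ]
    if right:
        return min(right, key=lambda b: b.x)
    below = [
        b for b in candidates
        if b.y >= key.y + key.h and b.x < key.x + key.w and b.x + b.w > key.x
    ]
    if below:
        return min(below, key=lambda b: b.y)
    return None


def structure_label(boxes: list, keywords: set) -> dict:
    """Separate keyword boxes from data boxes, then pair them up."""
    keys = [b for b in boxes if is_keyword(b, keywords)]
    data = [b for b in boxes if not is_keyword(b, keywords)]
    result = {}
    # Visit keywords in reading order: top-to-bottom, then left-to-right.
    for key in sorted(keys, key=lambda b: (b.y, b.x)):
        value = find_value(key, data)
        if value is not None:
            result[normalize(key.text)] = value.text
            data.remove(value)  # each data box is used at most once
    return result


# Made-up sample label: one left-to-right pair and one top-to-bottom pair.
SAMPLE_BOXES = [
    TextBox("Net Weight:", 10, 10, 80, 12),
    TextBox("500 g", 100, 10, 40, 12),
    TextBox("Origin", 10, 40, 50, 12),
    TextBox("Australia", 10, 56, 70, 12),
]

print(structure_label(SAMPLE_BOXES, TEMPLATE_KEYWORDS))
```

On the sample boxes, "Net Weight:" is paired with the box to its right and "Origin" with the box beneath it. A real system would presumably need fuzzy keyword matching and per-class templates, which this sketch leaves out.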