合并多种文字识别结果的问题

IF 0.4 Q4 INFORMATION SCIENCE & LIBRARY SCIENCE

Scientific and Technical Information Processing Pub Date : 2024-03-05 DOI:10.3103/s0147688223050027

V. V. Arlazarov

{"title":"合并多种文字识别结果的问题","authors":"V. V. Arlazarov","doi":"10.3103/s0147688223050027","DOIUrl":null,"url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>In this paper, the task of combining recognition results from multiple images is considered. Systems in which such problems occur are analyzed, and some known approaches are described. It should be noted that currently there is no unified approach that could be used to solve the combination problem for increasing text recognition accuracy using multiple images or in a video stream. As an example, a comparative study of three different approaches to the combination of per-frame recognition results of identity document fields is presented, and it is demonstrated that different approaches may be advantageous for different parts of a data set, while a selection of the potential best single result still significantly outperforms all of the analyzed methods. The task of the per-frame combination of recognition results is an important component in video stream recognition systems and requires careful consideration and the formulation of general approaches that would be applicable to various domains.</p>","PeriodicalId":43962,"journal":{"name":"Scientific and Technical Information Processing","volume":"28 1","pages":""},"PeriodicalIF":0.4000,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Problems of Combining Multiple Text Recognition Results\",\"authors\":\"V. V. Arlazarov\",\"doi\":\"10.3103/s0147688223050027\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<h3 data-test=\\\"abstract-sub-heading\\\">Abstract</h3><p>In this paper, the task of combining recognition results from multiple images is considered. Systems in which such problems occur are analyzed, and some known approaches are described. It should be noted that currently there is no unified approach that could be used to solve the combination problem for increasing text recognition accuracy using multiple images or in a video stream. As an example, a comparative study of three different approaches to the combination of per-frame recognition results of identity document fields is presented, and it is demonstrated that different approaches may be advantageous for different parts of a data set, while a selection of the potential best single result still significantly outperforms all of the analyzed methods. The task of the per-frame combination of recognition results is an important component in video stream recognition systems and requires careful consideration and the formulation of general approaches that would be applicable to various domains.</p>\",\"PeriodicalId\":43962,\"journal\":{\"name\":\"Scientific and Technical Information Processing\",\"volume\":\"28 1\",\"pages\":\"\"},\"PeriodicalIF\":0.4000,\"publicationDate\":\"2024-03-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific and Technical Information Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3103/s0147688223050027\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"INFORMATION SCIENCE & LIBRARY SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific and Technical Information Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3103/s0147688223050027","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}

引用次数: 0

摘要

摘要本文考虑了将多幅图像的识别结果进行组合的任务。本文分析了出现此类问题的系统，并介绍了一些已知的方法。需要指出的是，目前还没有一种统一的方法可以用来解决组合问题，以提高使用多幅图像或视频流的文本识别准确率。举例来说，本文介绍了对身份文件字段的每帧识别结果进行组合的三种不同方法的比较研究，结果表明，不同的方法可能对数据集的不同部分具有优势，而选择潜在的最佳单一结果仍然明显优于所有分析方法。按帧组合识别结果的任务是视频流识别系统的重要组成部分，需要仔细考虑并制定适用于不同领域的通用方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Problems of Combining Multiple Text Recognition Results

查看原文本刊更多论文

Problems of Combining Multiple Text Recognition Results

Abstract

In this paper, the task of combining recognition results from multiple images is considered. Systems in which such problems occur are analyzed, and some known approaches are described. It should be noted that currently there is no unified approach that could be used to solve the combination problem for increasing text recognition accuracy using multiple images or in a video stream. As an example, a comparative study of three different approaches to the combination of per-frame recognition results of identity document fields is presented, and it is demonstrated that different approaches may be advantageous for different parts of a data set, while a selection of the potential best single result still significantly outperforms all of the analyzed methods. The task of the per-frame combination of recognition results is an important component in video stream recognition systems and requires careful consideration and the formulation of general approaches that would be applicable to various domains.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Scientific and Technical Information Processing INFORMATION SCIENCE & LIBRARY SCIENCE-

CiteScore

1.00

自引率

42.90%

发文量

期刊介绍： Scientific and Technical Information Processing is a refereed journal that covers all aspects of management and use of information technology in libraries and archives, information centres, and the information industry in general. Emphasis is on practical applications of new technologies and techniques for information analysis and processing.