Text Extraction by Optical Character Recognition-Based on the Template Card

Panas Thongtaweechaikij, Piyawat Tangpong, J. Inthiam, W. Tangsuksant
DOI: 10.1109/RESTCON60981.2024.10463567 (https://doi.org/10.1109/RESTCON60981.2024.10463567)
Published in: 2024 1st International Conference on Robotics, Engineering, Science, and Technology (RESTCON), pp. 188-192
Publication date: 2024-02-16
Citations: 0

Abstract

This study evaluates the effectiveness of Optical Character Recognition (OCR) in extracting and organizing data from student cards. It assesses several OCR techniques to identify the most accurate method for text extraction across different formats and languages, and it examines how OCR output can be integrated into databases to improve searchability and usability. The proposed method adds a pre-processing stage to the OCR process that includes SIFT, KNN feature matching, the MSER technique for noise detection, and image transformation. In the experiment, all student cards came from King Mongkut's University of Technology North Bangkok and were captured by smartphone cameras with a resolution greater than 2 megapixels. The study compares traditional Tesseract OCR with the proposed method at Intersection over Union (IoU) thresholds of 50% and 70%. The proposed method at the 70% IoU threshold achieved the highest accuracy, 97.36%. These results indicate that the proposed method is feasible for the system.
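The abstract reports results at IoU thresholds of 50% and 70% but does not say which regions are compared or how accuracy is derived from them. The sketch below only illustrates the standard IoU definition for axis-aligned boxes and one plausible way a threshold could be applied. The box format `(x_min, y_min, x_max, y_max)`, the helper `hit_rate`, and the one-to-one pairing of predicted and ground-truth boxes are assumptions made for illustration, not the authors' evaluation protocol.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hit_rate(predicted: Sequence[Box], truth: Sequence[Box], threshold: float) -> float:
    """Hypothetical helper: fraction of ground-truth boxes whose paired
    prediction (same index) reaches IoU >= threshold."""
    if not truth:
        return 0.0
    hits = sum(1 for p, t in zip(predicted, truth) if iou(p, t) >= threshold)
    return hits / len(truth)


if __name__ == "__main__":
    # Made-up example boxes, not data from the paper.
    truth = [(0, 0, 100, 20), (0, 30, 100, 50)]
    predicted = [(0, 0, 100, 20), (20, 30, 120, 50)]
    for threshold in (0.5, 0.7):
        print(threshold, hit_rate(predicted, truth, threshold))
```

In this made-up example the second predicted box overlaps its ground-truth box with an IoU of 2/3, so it counts as a match at the 50% threshold but not at the 70% threshold. A stricter threshold therefore demands tighter localization of each text field before it is counted.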