An Impact of YOLOv5 on Text Detection and Recognition System using TesseractOCR in Images/Video Frames
Y. Chaitra, R. Dinesh, M. Jeevan, M. Arpitha, V. Aishwarya, K. Akshitha
2022 IEEE International Conference on Data Science and Information System (ICDSIS), published 2022-07-29
DOI: 10.1109/ICDSIS55133.2022.9915927
Citations: 2
Abstract
Text detection and recognition in images and videos are significant research areas in computer vision. Computer vision is used for real-time traffic monitoring in smart cities, where a security camera can simultaneously record the license plate information of suspect vehicles. The challenging task is detecting arbitrarily oriented text, as found in aerial photographs and scene text. Most existing text detection and recognition methods are designed for images with clean backgrounds and near-horizontal text, so they are not effective on complex images and video streams. To address this issue, we propose a system that detects text using the YOLOv5s model, which trains effectively on small-scale images, and the YOLOv5x model for large-scale images. TesseractOCR then recognizes the detected text by converting each detected image region to a string, and the result is stored in CSV format. Experiments were carried out on ICDAR2013, ICDAR2015, and YVT images/frames. The results indicate that the proposed method using YOLOv5x detects text in images/video frames with reasonably good accuracy, and that the TesseractOCR recognition rate is adequate for near-horizontal text.
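
The detect, recognize, and store pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the CSV column layout, and the confidence threshold are all hypothetical, and the detector loads the stock `yolov5s` weights from the public `ultralytics/yolov5` hub entry point, whereas the paper's models were trained on text datasets. The detector and recognizer are passed in as plain callables so the crop-and-store logic can be exercised without PyTorch or Tesseract installed.

```python
import csv
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
FIELDS = ["frame", "x_min", "y_min", "x_max", "y_max", "text"]  # assumed layout


def load_yolov5_detector(weights: str = "yolov5s",
                         min_conf: float = 0.25) -> Callable[[object], List[Box]]:
    """Build a detector from the public ultralytics/yolov5 hub model.

    Assumption: stock weights are used here; a text detector would need
    weights trained on text data, which this sketch does not provide.
    """
    import torch  # imported lazily so the rest of the sketch runs without it

    model = torch.hub.load("ultralytics/yolov5", weights)

    def detect(image) -> List[Box]:
        # Each row of results.xyxy[0] is: x1, y1, x2, y2, confidence, class.
        rows = model(image).xyxy[0].tolist()
        return [tuple(int(v) for v in row[:4]) for row in rows if row[4] >= min_conf]

    return detect


def tesseract_recognize(crop) -> str:
    """Convert one cropped region to a string with Tesseract via pytesseract."""
    import pytesseract  # imported lazily; requires a Tesseract installation

    return pytesseract.image_to_string(crop).strip()


def recognize_regions(image, frame_id: str,
                      detect: Callable[[object], List[Box]],
                      recognize: Callable[[object], str]) -> List[dict]:
    """Detect boxes, crop each one, run OCR, and keep the non-empty results."""
    records = []
    # Sort top-to-bottom, then left-to-right, for a stable reading order.
    for box in sorted(detect(image), key=lambda b: (b[1], b[0])):
        text = recognize(image.crop(box))  # PIL-style crop((left, upper, right, lower))
        if text:
            records.append({"frame": frame_id, "x_min": box[0], "y_min": box[1],
                            "x_max": box[2], "y_max": box[3], "text": text})
    return records


def write_csv(records: List[dict], handle) -> None:
    """Write the recognized strings to an open text file in CSV format."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
```

With the real libraries installed, a frame opened with `PIL.Image.open(...)` could be passed to `recognize_regions` together with `load_yolov5_detector()` and `tesseract_recognize`, and the returned records written with `write_csv` to a file opened with `newline=""`. Swapping `"yolov5s"` for `"yolov5x"` selects the larger model mentioned in the abstract.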