Enhanced Character Segmentation for Multi-Language Data Plate in Substation Transformer Based on Connected Component Analysis
Jieling Zheng, Xiren Miao, Shih-Hau Fang, Jing Chen, Hao Jiang
2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), published 2018-11-01
DOI: 10.1109/ICARCV.2018.8581282 (https://doi.org/10.1109/ICARCV.2018.8581282)
Citations: 1
Abstract
Intelligent inspection of substation transformers using optical character recognition has been developing rapidly. Segmenting characters from the text lines of a data plate is an important step in locating and recognizing electrical equipment. However, on-site character segmentation is challenging when the data plate contains multiple languages, especially when the widths of Chinese and non-Chinese characters differ significantly and the complex environment causes light reflection and fading. This paper proposes a new method, based on analysis of connected components and the structure of Chinese characters, to segment characters from multi-language data plates in substations. The method combines the HSV color space with multi-scale MSRCP to reduce the effects of illumination and complex backgrounds. It uses the width of each kind of character, the interval between characters, and the relationship between the parts of left-right structured Chinese characters to improve segmentation accuracy. Experimental results show that text lines from substation transformer data plates, including Chinese, English, Roman numerals, Arabic numerals, and symbols, can be segmented correctly. The proposed method outperforms two existing character segmentation methods and achieves 99.4% precision on the multi-language data plate dataset.
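The general idea of connected-component segmentation with width and interval rules can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `connected_components` and `merge_boxes`, the greedy left-to-right merge, and the thresholds `max_gap` and `max_width` are all assumptions chosen for illustration, and the input is assumed to be an already binarized text line.

```python
from collections import deque


def connected_components(img):
    """Return inclusive bounding boxes (x0, y0, x1, y1) of 8-connected foreground blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


def merge_boxes(boxes, max_gap, max_width):
    """Greedily merge a box into its left neighbour (hypothetical rule, not from the paper).

    Two boxes are merged when the blank interval between them is at most
    max_gap columns and the merged box is at most max_width columns wide,
    which loosely mimics rejoining the halves of a left-right structured character.
    """
    merged = []
    for box in sorted(boxes):  # sort by left edge
        if merged:
            prev = merged[-1]
            gap = box[0] - prev[2] - 1  # negative when the boxes overlap horizontally
            width = max(prev[2], box[2]) - prev[0] + 1
            if gap <= max_gap and width <= max_width:
                merged[-1] = (prev[0], min(prev[1], box[1]),
                              max(prev[2], box[2]), max(prev[3], box[3]))
                continue
        merged.append(box)
    return merged


# Toy text line: a two-part wide character (columns 0-4) and a narrow one (column 8).
img = [[c == "#" for c in "##.##...#"] for _ in range(5)]
parts = connected_components(img)
chars = merge_boxes(parts, max_gap=1, max_width=5)
print(chars)  # [(0, 0, 4, 4), (8, 0, 8, 4)]
```

In this toy example the two left components are one column apart and together fit within the assumed full-character width, so they are merged. The narrow component three columns further right stays separate. In practice the two thresholds would have to be estimated from the text line itself, for example from its height.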