Wei Li, Stefan Neullens, Matthias Breier, Marcel Bosling, T. Pretz, D. Merhof
IECON 2014 - 40th Annual Conference of the IEEE Industrial Electronics Society, 2014-10-01. DOI: 10.1109/IECON.2014.7049016 (https://doi.org/10.1109/IECON.2014.7049016)
Text recognition for information retrieval in images of printed circuit boards
Efficient and environmentally friendly recycling of printed circuit boards (PCBs) requires a comprehensive analysis of their material composition. Besides sophisticated chemical and physical methods for direct material analysis, an indirect method based on information retrieval provides a less costly and more efficient alternative. During information retrieval, PCBs and their components need to be recognized based on their appearance and the corresponding text information. Their material composition is then available through a pre-established database. Practical text recognition is therefore necessary for successful data analysis prior to PCB recycling. This paper focuses on two key aspects of text recognition: binarization and the final recognition of text objects using optical character recognition (OCR) engines. For the binarization of text content, a novel local thresholding method using an adaptive window size along with background estimation is presented. Several state-of-the-art algorithms and the proposed method were evaluated to compare their binarization performance on text objects in PCB images. On a data set containing manually created references, the proposed method provides superior results. Furthermore, in contrast to previous work on text recognition, available open source OCR engines were additionally evaluated to assess the technical limitations of OCR applications. We show that the quality of text recognition can be significantly improved if the binarization approach accounts for these technical limitations of OCR software. The presented method and results are expected to improve OCR performance in other applications as well.
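The abstract does not give the thresholding rule itself, so the following is only a minimal sketch of generic local thresholding, not the proposed method. It compares each pixel with the mean of a fixed-size window, computed through a summed-area table. The adaptive window size and the background estimation named in the abstract are not reproduced. The function names, the `window` and `bias` parameters, and the assumption of dark text on a lighter background are all illustrative choices.

```python
from typing import List

Image = List[List[int]]  # grayscale values 0..255, row-major (assumed layout)


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table with an extra leading zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def local_mean_binarize(img: Image, window: int = 15, bias: float = 0.15) -> Image:
    """Return 1 where a pixel is darker than (1 - bias) * local mean, else 0.

    Hypothetical helper: fixed window, dark-on-light text assumed.
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        # Clip the window at the image border.
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            # pixel < (1 - bias) * mean, with mean = total / area
            if img[y][x] * area < total * (1.0 - bias):
                out[y][x] = 1
    return out


if __name__ == "__main__":
    # Uniform bright patch with one dark pixel in the centre.
    patch = [[200] * 5 for _ in range(5)]
    patch[2][2] = 50
    for row in local_mean_binarize(patch, window=3):
        print(row)
```

For light print on a dark board the comparison would have to be inverted, or the image negated first. A window that is small relative to the stroke width tends to hollow out thick characters, which is one reason an adaptive window size can help.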