Leveraging Transfer Learning and GAN Models for OCR from Engineering Documents
Wael Khallouli, Raphael Pamie-George, Samuel F. Kovacic, A. Sousa-Poza, M. Canan, Jiang Li
2022 IEEE World AI IoT Congress (AIIoT), published 2022-06-06
DOI: https://doi.org/10.1109/aiiot54504.2022.9817319
Citations: 5
Abstract
Digital engineering, the digital transformation of engineering practice, is profoundly changing traditional engineering by rapidly integrating digital technologies and digital models into the life cycles of engineering processes. The traditional engineering process relies heavily on static engineering documents (e.g., spreadsheets, technical drawings, and scanned documents) to store and share information. A critical task in digital engineering is to extract relevant textual information from these documents into machine-readable, editable formats. This paper explores deep learning models and optical character recognition (OCR) methods for extracting textual information from engineering documents collected by the Navy's Military Sealift Command division. We propose a deep learning-based OCR framework for this task that integrates several modules: a pre-trained text detection model, a fine-tuned OCR algorithm, and a deep generative model that augments the data used for fine-tuning. Experimental results show that fine-tuning significantly improved the word accuracy of OCR models, from 60%-70% to 90% and above. Furthermore, the deep adversarial generative approach proved to be an effective model for data augmentation.
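The abstract reports results as word accuracy but does not define the metric. A minimal sketch, assuming word accuracy means the fraction of cropped word images whose recognized text exactly matches the ground-truth label (a common convention, not confirmed by the source). The function name `word_accuracy` and the sample strings are hypothetical and are not taken from the paper's data.

```python
from typing import Sequence


def word_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of words recognized exactly (assumed definition)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one reference word is required")
    # A word counts as correct only if every character matches the label.
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical labels and OCR outputs, for illustration only.
labels = ["VALVE", "PUMP", "DWG-104", "REV", "B"]
baseline = ["VALVE", "PUNP", "DWG-1O4", "REV", "B"]  # two misreads
print(word_accuracy(baseline, labels))
```

Under this definition a single wrong character makes the whole word incorrect, so the metric is stricter than a character-level error rate and is sensitive to confusions such as `0` versus `O` in drawing numbers.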