Taehan Lee, Hyeyeon Choi, Bum Jun Kim, Hyeonah Jang, Donggeon Lee, Sang Woo Kim
Font conversion for steel product number recognition: A conditioned diffusion model approach
Advanced Engineering Informatics, vol. 65, Article 103368. Published 2025-04-29. DOI: 10.1016/j.aei.2025.103368
Impact factor: 8.0 (JCR Q1, Computer Science, Artificial Intelligence)
https://www.sciencedirect.com/science/article/pii/S1474034625002617
Citations: 0
Abstract
In the steel manufacturing industry, automatically recognizing semi-finished product numbers is crucial for avoiding mix-ups and ensuring that each product is processed according to its specific material properties. Advances in deep learning have significantly improved the recognition of steel product numbers, particularly machine-printed numbers, whose consistent thickness and spacing yield high recognition accuracy. Numbers handwritten by workers, in contrast, are often hard to recognize: stroke thickness and spacing vary, and characters may be too thin, partially erased, or overwritten with scribbles. This inconsistency lowers the accuracy of steel product number recognition models on fonts with insufficient training data or fonts not seen during training. Such models must be updated whenever a new font is introduced, and they remain vulnerable to that font until sufficient data has been accumulated and the models updated. In this paper, we propose Font Changer, which converts various fonts into a representative font to address these issues. Font Changer is designed to learn the trajectory from a Gaussian distribution to the data distribution of images generated in a representative font with a clean background. Composed of a conditional image encoder and a diffusion model, Font Changer extracts location, size, and number information from the original image containing the steel product number. The extracted information is then used as a condition for the diffusion model, allowing it to generate the closest sample within the data distribution. Images processed by Font Changer are uniform, which keeps steel product number images consistent. Experiments demonstrate that Font Changer improves number recognition by removing background noise and converting even messy and damaged images into a consistent representative font. The proposed method benefits the steel manufacturing industry by standardizing fonts in work environments where diverse handwritten fonts are in use.
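The abstract does not give the network architecture, noise schedule, or sampler, so the following is only a minimal sketch of the general idea: conditioned, DDPM-style sampling that starts from Gaussian noise and ends on a clean representative-font image. Everything in it is an assumption made for illustration: the 3x5 digit templates, the linear schedule, and the names `encode_condition`, `predict_noise`, and `sample` are hypothetical. The noise predictor is an oracle that reads the clean template directly, standing in for a trained denoising network.

```python
import math
import random

# Toy "representative font": 3x5 bitmaps for two digits, flattened row by row.
# These templates are illustrative assumptions, not data from the paper.
TEMPLATES = {
    "0": [1, 1, 1,
          1, 0, 1,
          1, 0, 1,
          1, 0, 1,
          1, 1, 1],
    "1": [0, 1, 0,
          1, 1, 0,
          0, 1, 0,
          0, 1, 0,
          1, 1, 1],
}

T = 50  # number of diffusion steps (assumed value)
# Linear beta schedule (assumed); ALPHA_BARS holds the cumulative products.
BETAS = [1e-4 + (0.2 - 1e-4) * t / (T - 1) for t in range(T)]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
prod = 1.0
for a in ALPHAS:
    prod *= a
    ALPHA_BARS.append(prod)


def encode_condition(image):
    """Hypothetical stand-in for the conditional image encoder.

    Returns the digit whose template is nearest (squared error) to the input.
    """
    def dist(label):
        return sum((p - q) ** 2 for p, q in zip(image, TEMPLATES[label]))
    return min(TEMPLATES, key=dist)


def predict_noise(x_t, t, cond):
    """Oracle noise predictor standing in for a trained denoising network."""
    x0 = TEMPLATES[cond]
    ab = ALPHA_BARS[t]
    return [(x - math.sqrt(ab) * c) / math.sqrt(1.0 - ab)
            for x, c in zip(x_t, x0)]


def sample(cond, rng):
    """DDPM-style ancestral sampling from Gaussian noise, guided by cond."""
    x = [rng.gauss(0.0, 1.0) for _ in TEMPLATES[cond]]
    for t in reversed(range(T)):
        eps = predict_noise(x, t, cond)
        coef = BETAS[t] / math.sqrt(1.0 - ALPHA_BARS[t])
        mean = [(xi - coef * e) / math.sqrt(ALPHAS[t]) for xi, e in zip(x, eps)]
        sigma = math.sqrt(BETAS[t]) if t > 0 else 0.0  # no noise on the last step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x


rng = random.Random(0)
# A degraded "handwritten" 1: the template plus per-pixel Gaussian noise.
messy = [p + rng.gauss(0.0, 0.2) for p in TEMPLATES["1"]]
cond = encode_condition(messy)
clean = [round(v) for v in sample(cond, rng)]
print(cond, clean == TEMPLATES["1"])
```

Because the oracle always points at the clean template, the reverse process lands on that template regardless of the starting noise. A trained network would only approximate this behavior, and its output quality would depend on how well the condition captures the location, size, and number information described in the abstract.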
About the journal:
Advanced Engineering Informatics is an international Journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The Journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.