Script identification in multilingual environment: a survey in recent years
Authors: Yaowei Yang, Elham Eli, Alimjan Aysa, Kurban Ubul
Journal: Artificial Intelligence Review, vol. 58, no. 10 (published 2025-07-04)
DOI: 10.1007/s10462-025-11194-x
URL: https://link.springer.com/article/10.1007/s10462-025-11194-x
PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11194-x.pdf
Impact factor: 13.9 (JCR Q1, Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Multilingual support is an important trend in optical character recognition (OCR). In a multilingual environment, script identification is often combined with other tasks to complete multilingual processing jointly. As a front-end function of a multilingual OCR system, it automatically identifies the script or language of a text image and routes the text to the corresponding recognition engine. In practice, script identification plays a major role in multilingual scene understanding and in intelligent document analysis and recognition. This survey introduces script identification techniques and summarizes the related work published from 2017 to date, covering traditional learning methods, deep learning methods, and available datasets. Based on a comprehensive analysis of existing work, it gives researchers an up-to-date overview of recent script identification research. By discussing the problems that remain open, it aims to lay a foundation for future research in this area.
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.