A survey of recent approaches to form understanding in scanned documents
Abdelrahman Abdallah, Daniel Eberharter, Zoe Pfister, Adam Jatowt
Artificial Intelligence Review, vol. 57, no. 12. Published 2024-10-21.
DOI: 10.1007/s10462-024-11000-0
https://link.springer.com/article/10.1007/s10462-024-11000-0
Citation count: 0
Abstract
This paper presents a comprehensive survey of over 100 research works on the topic of form understanding in the context of scanned documents. We delve into recent advancements and breakthroughs in the field, with particular focus on transformer-based models, which have been shown to improve performance in form understanding tasks by up to 25% in accuracy compared to traditional methods. Our research methodology involves an in-depth analysis of popular documents and trends over the last decade, including 15 state-of-the-art models and 10 benchmark datasets. By examining these works, we offer novel insights into the evolution of this domain. Specifically, we highlight how transformers have revolutionized form-understanding techniques by enhancing the ability to process noisy scanned documents with significant improvements in OCR accuracy. Furthermore, we present an overview of the most relevant datasets, such as FUNSD, CORD, and SROIE, which serve as benchmarks for evaluating the performance of the models. By comparing the capabilities of these models and reporting an average improvement of 10–15% in key form extraction tasks, we aim to provide researchers and practitioners with useful guidance in selecting the most suitable solutions for their form understanding applications.
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.