Incorporating spatial context for post-OCR in map images
M. Namgung, Yao-Yi Chiang
Proceedings of the 5th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, 2022-11-01
https://doi.org/10.1145/3557918.3565864
Extracting text from historical maps with Optical Character Recognition (OCR) engines often yields partially or incorrectly recognized words because of complex map content. Previous work uses lexical approaches with linguistic context, or applies language models, to correct OCR results for documents. However, these post-OCR methods cannot directly take the spatial relations of map text into account. For example, "Mississippi" and "River" form the place phrase "Mississippi River" (a linguistic relation), while a "highway" label is likely to have intersecting "road" labels nearby that lead onto the highway (a spatial relation). This paper presents a novel approach that exploits the spatial arrangement of map text by using a contextual language model, BART [6], to post-process map text from OCR. The approach first structures word-level map text into sentences based on spatial arrangement, preserving the spatial location of the words that make up a place name, and then corrects imperfect OCR text using neighboring information. To train BART to capture spatial relations in map text, we automatically generate large numbers of synthetic maps and fine-tune BART on location names and their spatial context. Experiments on synthetic and real-world historical maps of various styles and scales show that the proposed method achieves significant improvement over the commonly used lexical approach.
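As an illustration of what "structuring word-level map text into sentences based on spatial arrangement" could look like, the sketch below orders OCR words by a greedy nearest-neighbor walk over their bounding-box centers. This is an assumption made for illustration only, not the procedure used in the paper; the `Word` class, the `to_pseudo_sentence` function, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Word:
    text: str
    x: float  # centre of the OCR bounding box (hypothetical units: pixels)
    y: float


def to_pseudo_sentence(words, start=0):
    """Order words by a greedy nearest-neighbour walk from words[start].

    Illustrative assumption only: the paper's actual structuring
    procedure is not specified in the abstract.
    """
    if not words:
        return []
    remaining = list(words)
    current = remaining.pop(start)
    ordered = [current]
    while remaining:
        # Pick the unvisited word whose centre is closest to the current one.
        nearest = min(
            remaining,
            key=lambda w: hypot(w.x - current.x, w.y - current.y),
        )
        remaining.remove(nearest)
        ordered.append(nearest)
        current = nearest
    return ordered


# Made-up word positions echoing the example in the abstract.
words = [
    Word("Mississippi", 100, 100),
    Word("highway", 400, 300),
    Word("River", 180, 105),
    Word("road", 430, 340),
]
sentence = " ".join(w.text for w in to_pseudo_sentence(words))
print(sentence)
```

In a sequence built this way, spatially adjacent labels end up next to each other, so a sequence-to-sequence model that reads the sentence can see each word's neighbors as context.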