Verónica Romero, A. Toselli, Joan Andreu Sánchez, E. Vidal
2016 12th IAPR Workshop on Document Analysis Systems (DAS), published 2016-04-01. DOI: 10.1109/DAS.2016.70 (https://doi.org/10.1109/DAS.2016.70)
Handwriting Transcription and Keyword Spotting in Historical Daily Records Documents
Historical records of daily activities offer an intriguing look into everyday life in the past. These documents contain information useful for demographic studies and genealogical research. However, automatic processing of historical documents has mostly focused on single works of literature and less on daily records, which tend to have a distinct layout, structure, and vocabulary. This paper studies the capability of state-of-the-art handwritten text recognition and keyword spotting systems when applied to this kind of document. The experiments use a relatively small set of handwritten birth records registered in Wien (Vienna) in the 16th century. A word accuracy of about 70% and an average precision (AP) of 0.74 are achieved for plain image transcription and keyword spotting, respectively. Given the many difficulties these handwritten documents exhibit, these preliminary results are quite encouraging.
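The two figures quoted in the abstract, word accuracy and AP, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: word accuracy is taken as one minus the word error rate (word-level edit distance divided by the number of reference words), and AP is taken as non-interpolated average precision over a single ranked list of spotting hits. The function names and toy inputs are hypothetical and are not the authors' evaluation code.

```python
from typing import Optional, Sequence


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - (word edit distance / reference length)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    errors = word_edit_distance(ref_words, hypothesis.split())
    return 1.0 - errors / len(ref_words)


def average_precision(ranked_relevance: Sequence[bool],
                      n_relevant: Optional[int] = None) -> float:
    """Non-interpolated AP over hits sorted by decreasing confidence.

    ranked_relevance[k] is True when the k-th ranked hit is a correct spot.
    n_relevant is the total number of true occurrences; when omitted, only
    the relevant items present in the list are counted.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    total = hits if n_relevant is None else n_relevant
    return precision_sum / total if total else 0.0


if __name__ == "__main__":
    # Toy inputs, not data from the paper.
    print(word_accuracy("a b c d", "a x c d"))
    print(average_precision([True, False, True]))
```

With the toy inputs, one substituted word out of four reference words gives a word accuracy of 0.75, and the ranked list gives an AP of (1/1 + 2/3) / 2, roughly 0.83. Keyword spotting evaluations often use interpolated precision or average over many queries, so results from this sketch are not directly comparable to the reported 0.74.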