Proceedings of the 7th International Workshop on Historical Document Imaging and Processing: Latest Publications

Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents
Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret, V. Christlein
{"title":"Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents","authors":"Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret, V. Christlein","doi":"10.1145/3604951.3605519","DOIUrl":"https://doi.org/10.1145/3604951.3605519","url":null,"abstract":"In the context of automated classification of historical documents, we investigate three contemporary self-supervised learning (SSL) techniques (SimSiam, Dino, and VICReg) for the pre-training of three different document analysis tasks, namely script-type, font-type, and location classification. Our study draws samples from multiple datasets that contain images of manuscripts, prints, charters, and letters. The representations derived via pre-text training are taken as inputs for k-NN classification and a parametric linear classifier. The latter is placed atop the pre-trained backbones to enable fine-tuning of the entire network to further improve the classification by exploiting task-specific label data. The network’s final performance is assessed via independent test sets obtained from the ICDAR2021 Competition on Historical Document Classification. We empirically show that representations learned with SSL are significantly better suited for subsequent document classification than features generated by commonly used transfer learning on ImageNet.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121022906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
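The evaluation protocol described in the abstract above (k-NN and linear probing on features from a frozen, SSL-pretrained backbone) can be illustrated roughly as follows. The ResNet backbone, checkpoint path, k value, and data loaders are placeholders for illustration, not the authors' actual setup.

```python
# Sketch of k-NN evaluation on frozen backbone features; a generic torchvision
# ResNet stands in for the SSL-pretrained encoder, and the data loaders,
# checkpoint, and k are hypothetical placeholders.
import torch
import torchvision
from sklearn.neighbors import KNeighborsClassifier

def extract_features(backbone, loader, device="cpu"):
    backbone.eval().to(device)
    feats, labels = [], []
    with torch.no_grad():
        for images, targets in loader:
            f = backbone(images.to(device))          # (B, D) pooled features
            feats.append(f.cpu())
            labels.append(targets)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# Hypothetical SSL-pretrained encoder with the classification head removed.
backbone = torchvision.models.resnet50(weights=None)
backbone.fc = torch.nn.Identity()
# backbone.load_state_dict(torch.load("ssl_pretrained.pt"))   # assumed checkpoint

# train_loader / test_loader are assumed DataLoaders over document crops.
# X_train, y_train = extract_features(backbone, train_loader)
# X_test, y_test = extract_features(backbone, test_loader)
# knn = KNeighborsClassifier(n_neighbors=20).fit(X_train, y_train)
# print("k-NN accuracy:", knn.score(X_test, y_test))
```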
Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts
Kai Labusch, Clemens Neudecker
{"title":"Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts","authors":"Kai Labusch, Clemens Neudecker","doi":"10.1145/3604951.3605516","DOIUrl":"https://doi.org/10.1145/3604951.3605516","url":null,"abstract":"Identification of images that are close to each other in terms of their iconographical meaning requires an applicable distance measure for text-image or image-image pairs. To obtain such a measure of distance, we finetune a group of contrastive loss based text-to-image similarity models (MS-CLIP) with respect to a large number of Iconclass visual concepts by means of natural language supervised learning. We show that there are certain Iconclass concepts that actually can be learned by the models whereas other visual concepts cannot be learned. We hypothesize that the visual concepts that can be learned more easily are intrinsically different from those that are more difficult to learn and that these qualitative differences can provide a valuable orientation for future research directions in text-to-image similarity learning.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114424739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
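For readers unfamiliar with the training objective behind CLIP-style models such as those fine-tuned above, here is a minimal sketch of the symmetric contrastive loss over paired image and text embeddings. The embedding size and temperature are illustrative assumptions; no specific MS-CLIP implementation is implied.

```python
# Minimal sketch of a symmetric contrastive (CLIP-style) text-image objective;
# dimensions and temperature are illustrative only.
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """image_emb, text_emb: (N, D) embeddings of N matching image/text pairs."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature     # (N, N) similarity matrix
    targets = torch.arange(len(image_emb))              # i-th image matches i-th text
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random embeddings standing in for encoder outputs.
img = torch.randn(8, 512)
txt = torch.randn(8, 512)
print(clip_style_loss(img, txt).item())
```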
Classifying The Scripts of Aramaic Incantation Bowls
Said Naamneh, Nour Atamni, Boraq Madi, Daria Vasyutinsky Shapira, Irina Rabaev, Jihad El-Sana, Shoshana Boardman
{"title":"Classifying The Scripts of Aramaic Incantation Bowls","authors":"Said Naamneh, Nour Atamni, Boraq Madi, Daria Vasyutinsky Shapira, Irina Rabaev, Jihad El-Sana, Shoshana Boardman","doi":"10.1145/3604951.3605510","DOIUrl":"https://doi.org/10.1145/3604951.3605510","url":null,"abstract":"Aramaic incantation bowls are a magical object commonly used in Sasanian Mesopotamia (the region that includes modern-day Iraq and Iran) between the 4th and 7th centuries CE. These bowls were typically made of clay and inscribed with incantations in three dialects of Aramaic, the languages widely spoken in the region then. This paper focuses on bowls written in Jewish Babylonian Aramaic. The purpose of these bowls was to protect the homes of their owners from evil spirits and demons. The inscriptions on the bowls were most often written in a spiral fashion and often included the names of various demons and invocations of protective spirits and angels, alongside the names and family relationships of the clients, Biblical quotations, and other interesting material. The bowls were buried upside down beneath the floor of a home so that the incantations faced downward towards the underworld. This study tackles the problem of automatic classification of the script style of incantation bowls. To this end, we prepare and introduce a new dataset of incantation bowl images from the 4th to 7th centuries CE. We experiment with and compare several Siamese-based architectures, and introduce a new Multi-Level-of-Detail architecture, which extracts features at different scales. Our results establish baselines for future research and make valuable contributions to ongoing research addressing challenges in working with ancient artifact images.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123772409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
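A minimal sketch of a Siamese comparison network of the kind evaluated above, assuming a toy CNN encoder and a standard contrastive loss; it is not the paper's Multi-Level-of-Detail architecture.

```python
# Toy Siamese setup for comparing two script-image crops; backbone, embedding
# size, and margin are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, emb_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, emb_dim)

    def forward(self, x):                       # x: (B, 1, H, W) grayscale crop
        return self.head(self.cnn(x).flatten(1))

def contrastive_loss(z1, z2, same_script, margin=1.0):
    """same_script: 1.0 if both crops share a script style, else 0.0."""
    d = F.pairwise_distance(z1, z2)
    return (same_script * d.pow(2) +
            (1 - same_script) * F.relu(margin - d).pow(2)).mean()

enc = SiameseEncoder()
a, b = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)
y = torch.tensor([1., 0., 1., 0.])
print(contrastive_loss(enc(a), enc(b), y).item())
```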
Two-step sequence transformer based method for Cham to Latin script transliteration
Tien-Nam Nguyen, J. Burie, Thi-Lan Le, Anne-Valérie Schweyer
{"title":"Two-step sequence transformer based method for Cham to Latin script transliteration","authors":"Tien-Nam Nguyen, J. Burie, Thi-Lan Le, Anne-Valérie Schweyer","doi":"10.1145/3604951.3605525","DOIUrl":"https://doi.org/10.1145/3604951.3605525","url":null,"abstract":"Fusion information between visual and textual information is an interesting way to better represent the features. In this work, we propose a method for the text line transliteration of Cham manuscripts by combining visual and textual modality. Instead of using a standard approach that directly recognizes the words in the image, we split the problem into two steps. Firstly, we propose a scenario for recognition where similar characters are considered as unique characters, then we use the transformer model which considers both visual and context information to adjust the prediction when it concerns similar characters to be able to distinguish them. Based on this two-step strategy, the proposed method consists of a sequence to sequence model and a multi-modal transformer. Thus, we can take advantage of both the sequence-to-sequence model and the transformer model. Extensive experiments prove that the proposed method outperforms the approaches of the literature on our Cham manuscripts dataset.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132429168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
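The two-step strategy described above (recognize merged classes of visually similar characters, then disambiguate them with context) can be illustrated with a deliberately toy example. The character groups and bigram scores below are entirely hypothetical placeholders, not the Cham character inventory or the paper's models.

```python
# Toy illustration of the two-step idea: step 1 predicts merged classes of
# visually similar glyphs, step 2 disambiguates them using context scores.
# All symbols and scores are hypothetical.
MERGED = {"a1": "A", "a2": "A", "b": "B"}        # similar glyphs a1/a2 share class A
CANDIDATES = {"A": ["a1", "a2"], "B": ["b"]}

def step1_recognize(glyph):
    """Stand-in for the visual sequence-to-sequence recognizer: merged class only."""
    return MERGED[glyph]

def step2_disambiguate(merged_seq, bigram_scores):
    """Stand-in for the context model: pick the candidate whose bigram with the
    previous output scores highest."""
    out = []
    for cls in merged_seq:
        prev = out[-1] if out else "<s>"
        best = max(CANDIDATES[cls], key=lambda c: bigram_scores.get((prev, c), 0.0))
        out.append(best)
    return out

scores = {("<s>", "a1"): 0.9, ("a1", "b"): 0.8, ("b", "a2"): 0.7}
merged = [step1_recognize(g) for g in ["a1", "b", "a2"]]
print(step2_disambiguate(merged, scores))        # ['a1', 'b', 'a2']
```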
Study of historical Byzantine seal images: the BHAI project for computer-based sigillography
Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, A. Fiandrotti, Beatrice Caseau, Isabelle Bloch
{"title":"Study of historical Byzantine seal images: the BHAI project for computer-based sigillography","authors":"Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, A. Fiandrotti, Beatrice Caseau, Isabelle Bloch","doi":"10.1145/3604951.3605523","DOIUrl":"https://doi.org/10.1145/3604951.3605523","url":null,"abstract":"BHAI 1 (Byzantine Hybrid Artificial Intelligence) is the first project based on artificial intelligence dedicated to Byzantine seals. The scientific consortium comprises a multidisciplinary team involving historians specialized in the Byzantine period, specialists in sigillography, and computer science experts. This article describes the main objectives of this project: data acquisition of seal images, text and iconography recognition, seal dating, as well as our current achievements and first results on character recognition and spatial analysis of personages.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123746170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A hybrid CNN-Transformer model for Historical Document Image Binarization
V. Rezanezhad, Konstantin Baierer, Clemens Neudecker
{"title":"A hybrid CNN-Transformer model for Historical Document Image Binarization","authors":"V. Rezanezhad, Konstantin Baierer, Clemens Neudecker","doi":"10.1145/3604951.3605508","DOIUrl":"https://doi.org/10.1145/3604951.3605508","url":null,"abstract":"Document image binarization is one of the main preprocessing steps in document image analysis for text recognition. Noise, faint characters, bad scanning conditions, uneven lighting or paper aging can cause artifacts that negatively impact text recognition algorithms. The task of binarization is to segment the foreground (text) from these degradations in order to improve optical character recognition (OCR) results. Convolutional Neural Networks (CNNs) are one popular method for binarization. But they suffer from focusing on the local context in a document image. We have applied a hybrid CNN-Transformer model to convert a document image into a binary output. For the model training, we used datasets from the Document Image Binarization Contests (DIBCO). For the datasets DIBCO-2012, DIBCO-2017 and DIBCO-2018, our model outperforms the state-of-the-art algorithms.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125099427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
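As background for the hybrid approach above, here is an illustrative (and much smaller) CNN-Transformer binarizer: a convolutional stem extracts local features, a transformer encoder mixes global context over the resulting patch tokens, and a transposed-convolution decoder produces a per-pixel binary logit map. It is a sketch of the general idea, not the authors' model.

```python
# Tiny illustrative CNN-Transformer binarizer; layer sizes are arbitrary.
import torch
import torch.nn as nn

class TinyHybridBinarizer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # 1/4-resolution local features
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Sequential(                   # back to full resolution, 1 logit per pixel
            nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(dim, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):                               # x: (B, 3, H, W)
        f = self.cnn(x)                                 # (B, C, H/4, W/4)
        B, C, H, W = f.shape
        tokens = self.transformer(f.flatten(2).transpose(1, 2))   # (B, H*W/16, C)
        f = tokens.transpose(1, 2).reshape(B, C, H, W)
        return self.decoder(f)                          # binary logits, (B, 1, H, W)

model = TinyHybridBinarizer()
print(model(torch.randn(1, 3, 128, 128)).shape)         # torch.Size([1, 1, 128, 128])
```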
Homer restored: Virtual reconstruction of Papyrus Bodmer 1
Simon Perrin, Léopold Cudilla, Yejing Xie, H. Mouchère, Isabelle Marthot-Santaniello
{"title":"Homer restored: Virtual reconstruction of Papyrus Bodmer 1","authors":"Simon Perrin, Léopold Cudilla, Yejing Xie, H. Mouchère, Isabelle Marthot-Santaniello","doi":"10.1145/3604951.3605518","DOIUrl":"https://doi.org/10.1145/3604951.3605518","url":null,"abstract":"In this paper, we propose a complete method to reconstruct a damaged piece of papyrus using its image annotated at the character level and the original ancient Greek text (known otherwise). Our reconstruction allows us to recreate the written surface, making it readable and consistent with the original one. Our method is in two stages. First, the text is reconstructed by pasting character patches in their possible locations. Second, we reconstruct the background of the papyrus by applying inpainting methods. Two different inpainting techniques are tested in this article, one traditional and one using a GAN. This global reconstruction method is applied on a piece of Papyrus Bodmer 1. The results are evaluated visually by the authors of the paper and by researchers in papyrology. This reconstruction allows historians to investigate new paths on the topic of writing culture and materiality while it significantly improves the ability of non specialists to picture what this papyrus, and ancient books in general, could have looked like in Antiquity.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134501182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
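The paper tests one traditional and one GAN-based inpainting technique for the papyrus background; the sketch below shows what such a traditional step could look like with OpenCV. The Telea algorithm, radius, and file names are assumptions for illustration, since the abstract does not name the classical method used.

```python
# Sketch of a classical inpainting step on the papyrus background; algorithm
# choice, radius, and file names are hypothetical.
import cv2
import numpy as np

def inpaint_background(papyrus_bgr: np.ndarray, hole_mask: np.ndarray) -> np.ndarray:
    """hole_mask: uint8 image, non-zero where the papyrus surface is missing."""
    return cv2.inpaint(papyrus_bgr, hole_mask, 5, cv2.INPAINT_TELEA)  # radius = 5 px

# Hypothetical usage: the mask would come from annotation of damaged regions.
# img = cv2.imread("bodmer1_fragment.png")
# mask = cv2.imread("bodmer1_holes.png", cv2.IMREAD_GRAYSCALE)
# cv2.imwrite("bodmer1_inpainted.png", inpaint_background(img, mask))
```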
NAME – A Rich XML Format for Named Entity and Relation Tagging
C. Clausner, S. Pletschacher, A. Antonacopoulos
{"title":"NAME – A Rich XML Format for Named Entity and Relation Tagging","authors":"C. Clausner, S. Pletschacher, A. Antonacopoulos","doi":"10.1145/3604951.3605521","DOIUrl":"https://doi.org/10.1145/3604951.3605521","url":null,"abstract":"We present NAME XML, a schema for named entities and relations in documents. The standout features are: option to reference a variety of document formats (such as PAGE XML or plain text), support of entity hierarchies, custom entity types via ontologies, more expressivity due to disambiguation of base entities and entity attributes (e.g. “person” and “person name”), and relations between entities that can be directed or undirected. We describe the format in detail, show examples, and discuss real-world use cases.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134394575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
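To make the idea of entity attributes and directed relations concrete, the snippet below builds a small illustrative record in Python. The element and attribute names are hypothetical stand-ins and do not reproduce the actual NAME XML schema.

```python
# Purely illustrative entity/relation record in the spirit of the features
# listed above; all element and attribute names are hypothetical.
import xml.etree.ElementTree as ET

root = ET.Element("Document", {"source": "page_0001.xml"})   # hypothetical reference to a PAGE XML file
entities = ET.SubElement(root, "Entities")
person = ET.SubElement(entities, "Entity", {"id": "e1", "type": "person"})
ET.SubElement(person, "Attribute", {"name": "person name", "value": "John Smith"})
place = ET.SubElement(entities, "Entity", {"id": "e2", "type": "place"})
ET.SubElement(place, "Attribute", {"name": "place name", "value": "Salford"})

relations = ET.SubElement(root, "Relations")
ET.SubElement(relations, "Relation",
              {"type": "born-in", "from": "e1", "to": "e2", "directed": "true"})

print(ET.tostring(root, encoding="unicode"))
```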
Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models
Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert
{"title":"Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models","authors":"Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert","doi":"10.1145/3604951.3605514","DOIUrl":"https://doi.org/10.1145/3604951.3605514","url":null,"abstract":"The Holocaust was a tragic and catastrophic event in World War II (WWII) history that resulted in the loss of millions of lives. In recent years, the emergence of the field of digital humanities has made the study of Holocaust testimonies an important area of research for historians, Holocaust educators, social scientists, and linguists. One of the challenges in analysing Holocaust testimonies is the recognition and categorisation of named entities such as concentration camps, military officers, ships, and ghettos, due to the scarcity of annotated data. This paper presents a research study on a domain-specific hybrid named-entity recognition model, which focuses on developing NER models specifically tailored for the Holocaust domain. To overcome the problem of data scarcity, we employed hybrid annotation approach to training different transformer model architectures in order to recognise the named entities. Results show transformer models to have good performance compared to other approaches.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122105646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
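The pseudo-labelling step mentioned in the title can be sketched as follows: a pretrained transformer NER model tags unlabelled testimony text, and only high-confidence predictions are kept as additional training data. The checkpoint name, example sentence, and confidence threshold are placeholders, not the paper's models or settings.

```python
# Sketch of pseudo-labelling with a pretrained transformer NER model; the
# checkpoint and threshold are illustrative placeholders.
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",        # placeholder generic English NER checkpoint
               aggregation_strategy="simple")

def pseudo_label(sentences, min_score=0.9):
    """Return (sentence, entity list) pairs whose predictions look confident."""
    labelled = []
    for sent in sentences:
        ents = [e for e in ner(sent) if e["score"] >= min_score]
        if ents:
            labelled.append((sent, [(e["word"], e["entity_group"]) for e in ents]))
    return labelled

unlabelled = ["We were deported from Warsaw to Auschwitz in 1943."]
for sent, ents in pseudo_label(unlabelled):
    print(sent, "->", ents)
```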
An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts
Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, A. Fornés
{"title":"An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts","authors":"Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, A. Fornés","doi":"10.1145/3604951.3605509","DOIUrl":"https://doi.org/10.1145/3604951.3605509","url":null,"abstract":"This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128040705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
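As generic background for this kind of HTR evaluation, the snippet below computes the character error rate (CER), a standard edit-distance-based metric; it is not presented as the paper's exact protocol or symbol set.

```python
# Character error rate (CER) via Levenshtein edit distance, a common HTR metric.
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    return levenshtein(prediction, reference) / max(len(reference), 1)

print(cer("ciphertezt", "ciphertext"))   # 0.1: one substitution over ten characters
```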