Proceedings of the 7th International Workshop on Historical Document Imaging and Processing: Latest Publications

Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents
Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret, V. Christlein
{"title":"Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents","authors":"Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret, V. Christlein","doi":"10.1145/3604951.3605519","DOIUrl":"https://doi.org/10.1145/3604951.3605519","url":null,"abstract":"In the context of automated classification of historical documents, we investigate three contemporary self-supervised learning (SSL) techniques (SimSiam, Dino, and VICReg) for the pre-training of three different document analysis tasks, namely script-type, font-type, and location classification. Our study draws samples from multiple datasets that contain images of manuscripts, prints, charters, and letters. The representations derived via pre-text training are taken as inputs for k-NN classification and a parametric linear classifier. The latter is placed atop the pre-trained backbones to enable fine-tuning of the entire network to further improve the classification by exploiting task-specific label data. The network’s final performance is assessed via independent test sets obtained from the ICDAR2021 Competition on Historical Document Classification. We empirically show that representations learned with SSL are significantly better suited for subsequent document classification than features generated by commonly used transfer learning on ImageNet.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121022906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
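The evaluation protocol described in the abstract above (k-NN and linear probing on features from a frozen, SSL-pretrained backbone) can be illustrated roughly as follows. The ResNet backbone, checkpoint path, k value, and data loaders are placeholders for illustration, not the authors' actual setup.

```python
# Sketch of k-NN evaluation on frozen backbone features; a generic torchvision
# ResNet stands in for the SSL-pretrained encoder, and the data loaders,
# checkpoint, and k are hypothetical placeholders.
import torch
import torchvision
from sklearn.neighbors import KNeighborsClassifier

def extract_features(backbone, loader, device="cpu"):
    backbone.eval().to(device)
    feats, labels = [], []
    with torch.no_grad():
        for images, targets in loader:
            f = backbone(images.to(device))          # (B, D) pooled features
            feats.append(f.cpu())
            labels.append(targets)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# Hypothetical SSL-pretrained encoder with the classification head removed.
backbone = torchvision.models.resnet50(weights=None)
backbone.fc = torch.nn.Identity()
# backbone.load_state_dict(torch.load("ssl_pretrained.pt"))   # assumed checkpoint

# train_loader / test_loader are assumed DataLoaders over document crops.
# X_train, y_train = extract_features(backbone, train_loader)
# X_test, y_test = extract_features(backbone, test_loader)
# knn = KNeighborsClassifier(n_neighbors=20).fit(X_train, y_train)
# print("k-NN accuracy:", knn.score(X_test, y_test))
```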
Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts
Kai Labusch, Clemens Neudecker
{"title":"Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts","authors":"Kai Labusch, Clemens Neudecker","doi":"10.1145/3604951.3605516","DOIUrl":"https://doi.org/10.1145/3604951.3605516","url":null,"abstract":"Identification of images that are close to each other in terms of their iconographical meaning requires an applicable distance measure for text-image or image-image pairs. To obtain such a measure of distance, we finetune a group of contrastive loss based text-to-image similarity models (MS-CLIP) with respect to a large number of Iconclass visual concepts by means of natural language supervised learning. We show that there are certain Iconclass concepts that actually can be learned by the models whereas other visual concepts cannot be learned. We hypothesize that the visual concepts that can be learned more easily are intrinsically different from those that are more difficult to learn and that these qualitative differences can provide a valuable orientation for future research directions in text-to-image similarity learning.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114424739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
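For readers unfamiliar with the training objective behind CLIP-style models such as those fine-tuned above, here is a minimal sketch of the symmetric contrastive loss over paired image and text embeddings. The embedding size and temperature are illustrative assumptions; no specific MS-CLIP implementation is implied.

```python
# Minimal sketch of a symmetric contrastive (CLIP-style) text-image objective;
# dimensions and temperature are illustrative only.
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """image_emb, text_emb: (N, D) embeddings of N matching image/text pairs."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature     # (N, N) similarity matrix
    targets = torch.arange(len(image_emb))              # i-th image matches i-th text
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random embeddings standing in for encoder outputs.
img = torch.randn(8, 512)
txt = torch.randn(8, 512)
print(clip_style_loss(img, txt).item())
```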
Classifying The Scripts of Aramaic Incantation Bowls
Said Naamneh, Nour Atamni, Boraq Madi, Daria Vasyutinsky Shapira, Irina Rabaev, Jihad El-Sana, Shoshana Boardman
{"title":"Classifying The Scripts of Aramaic Incantation Bowls","authors":"Said Naamneh, Nour Atamni, Boraq Madi, Daria Vasyutinsky Shapira, Irina Rabaev, Jihad El-Sana, Shoshana Boardman","doi":"10.1145/3604951.3605510","DOIUrl":"https://doi.org/10.1145/3604951.3605510","url":null,"abstract":"Aramaic incantation bowls are a magical object commonly used in Sasanian Mesopotamia (the region that includes modern-day Iraq and Iran) between the 4th and 7th centuries CE. These bowls were typically made of clay and inscribed with incantations in three dialects of Aramaic, the languages widely spoken in the region then. This paper focuses on bowls written in Jewish Babylonian Aramaic. The purpose of these bowls was to protect the homes of their owners from evil spirits and demons. The inscriptions on the bowls were most often written in a spiral fashion and often included the names of various demons and invocations of protective spirits and angels, alongside the names and family relationships of the clients, Biblical quotations, and other interesting material. The bowls were buried upside down beneath the floor of a home so that the incantations faced downward towards the underworld. This study tackles the problem of automatic classification of the script style of incantation bowls. To this end, we prepare and introduce a new dataset of incantation bowl images from the 4th to 7th centuries CE. We experiment with and compare several Siamese-based architectures, and introduce a new Multi-Level-of-Detail architecture, which extracts features at different scales. Our results establish baselines for future research and make valuable contributions to ongoing research addressing challenges in working with ancient artifact images.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123772409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
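A minimal sketch of a Siamese comparison network of the kind evaluated above, assuming a toy CNN encoder and a standard contrastive loss; it is not the paper's Multi-Level-of-Detail architecture.

```python
# Toy Siamese setup for comparing two script-image crops; backbone, embedding
# size, and margin are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, emb_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, emb_dim)

    def forward(self, x):                       # x: (B, 1, H, W) grayscale crop
        return self.head(self.cnn(x).flatten(1))

def contrastive_loss(z1, z2, same_script, margin=1.0):
    """same_script: 1.0 if both crops share a script style, else 0.0."""
    d = F.pairwise_distance(z1, z2)
    return (same_script * d.pow(2) +
            (1 - same_script) * F.relu(margin - d).pow(2)).mean()

enc = SiameseEncoder()
a, b = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)
y = torch.tensor([1., 0., 1., 0.])
print(contrastive_loss(enc(a), enc(b), y).item())
```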
Two-step sequence transformer based method for Cham to Latin script transliteration
Tien-Nam Nguyen, J. Burie, Thi-Lan Le, Anne-Valérie Schweyer
{"title":"Two-step sequence transformer based method for Cham to Latin script transliteration","authors":"Tien-Nam Nguyen, J. Burie, Thi-Lan Le, Anne-Valérie Schweyer","doi":"10.1145/3604951.3605525","DOIUrl":"https://doi.org/10.1145/3604951.3605525","url":null,"abstract":"Fusion information between visual and textual information is an interesting way to better represent the features. In this work, we propose a method for the text line transliteration of Cham manuscripts by combining visual and textual modality. Instead of using a standard approach that directly recognizes the words in the image, we split the problem into two steps. Firstly, we propose a scenario for recognition where similar characters are considered as unique characters, then we use the transformer model which considers both visual and context information to adjust the prediction when it concerns similar characters to be able to distinguish them. Based on this two-step strategy, the proposed method consists of a sequence to sequence model and a multi-modal transformer. Thus, we can take advantage of both the sequence-to-sequence model and the transformer model. Extensive experiments prove that the proposed method outperforms the approaches of the literature on our Cham manuscripts dataset.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132429168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
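The two-step strategy described above (recognize merged classes of visually similar characters, then disambiguate them with context) can be illustrated with a deliberately toy example. The character groups and bigram scores below are entirely hypothetical placeholders, not the Cham character inventory or the paper's models.

```python
# Toy illustration of the two-step idea: step 1 predicts merged classes of
# visually similar glyphs, step 2 disambiguates them using context scores.
# All symbols and scores are hypothetical.
MERGED = {"a1": "A", "a2": "A", "b": "B"}        # similar glyphs a1/a2 share class A
CANDIDATES = {"A": ["a1", "a2"], "B": ["b"]}

def step1_recognize(glyph):
    """Stand-in for the visual sequence-to-sequence recognizer: merged class only."""
    return MERGED[glyph]

def step2_disambiguate(merged_seq, bigram_scores):
    """Stand-in for the context model: pick the candidate whose bigram with the
    previous output scores highest."""
    out = []
    for cls in merged_seq:
        prev = out[-1] if out else "<s>"
        best = max(CANDIDATES[cls], key=lambda c: bigram_scores.get((prev, c), 0.0))
        out.append(best)
    return out

scores = {("<s>", "a1"): 0.9, ("a1", "b"): 0.8, ("b", "a2"): 0.7}
merged = [step1_recognize(g) for g in ["a1", "b", "a2"]]
print(step2_disambiguate(merged, scores))        # ['a1', 'b', 'a2']
```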
Study of historical Byzantine seal images: the BHAI project for computer-based sigillography
Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, A. Fiandrotti, Beatrice Caseau, Isabelle Bloch
{"title":"Study of historical Byzantine seal images: the BHAI project for computer-based sigillography","authors":"Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, A. Fiandrotti, Beatrice Caseau, Isabelle Bloch","doi":"10.1145/3604951.3605523","DOIUrl":"https://doi.org/10.1145/3604951.3605523","url":null,"abstract":"BHAI 1 (Byzantine Hybrid Artificial Intelligence) is the first project based on artificial intelligence dedicated to Byzantine seals. The scientific consortium comprises a multidisciplinary team involving historians specialized in the Byzantine period, specialists in sigillography, and computer science experts. This article describes the main objectives of this project: data acquisition of seal images, text and iconography recognition, seal dating, as well as our current achievements and first results on character recognition and spatial analysis of personages.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123746170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A hybrid CNN-Transformer model for Historical Document Image Binarization
V. Rezanezhad, Konstantin Baierer, Clemens Neudecker
{"title":"A hybrid CNN-Transformer model for Historical Document Image Binarization","authors":"V. Rezanezhad, Konstantin Baierer, Clemens Neudecker","doi":"10.1145/3604951.3605508","DOIUrl":"https://doi.org/10.1145/3604951.3605508","url":null,"abstract":"Document image binarization is one of the main preprocessing steps in document image analysis for text recognition. Noise, faint characters, bad scanning conditions, uneven lighting or paper aging can cause artifacts that negatively impact text recognition algorithms. The task of binarization is to segment the foreground (text) from these degradations in order to improve optical character recognition (OCR) results. Convolutional Neural Networks (CNNs) are one popular method for binarization. But they suffer from focusing on the local context in a document image. We have applied a hybrid CNN-Transformer model to convert a document image into a binary output. For the model training, we used datasets from the Document Image Binarization Contests (DIBCO). For the datasets DIBCO-2012, DIBCO-2017 and DIBCO-2018, our model outperforms the state-of-the-art algorithms.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125099427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
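As background for the hybrid approach above, here is an illustrative (and much smaller) CNN-Transformer binarizer: a convolutional stem extracts local features, a transformer encoder mixes global context over the resulting patch tokens, and a transposed-convolution decoder produces a per-pixel binary logit map. It is a sketch of the general idea, not the authors' model.

```python
# Tiny illustrative CNN-Transformer binarizer; layer sizes are arbitrary.
import torch
import torch.nn as nn

class TinyHybridBinarizer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # 1/4-resolution local features
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Sequential(                   # back to full resolution, 1 logit per pixel
            nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(dim, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):                               # x: (B, 3, H, W)
        f = self.cnn(x)                                 # (B, C, H/4, W/4)
        B, C, H, W = f.shape
        tokens = self.transformer(f.flatten(2).transpose(1, 2))   # (B, H*W/16, C)
        f = tokens.transpose(1, 2).reshape(B, C, H, W)
        return self.decoder(f)                          # binary logits, (B, 1, H, W)

model = TinyHybridBinarizer()
print(model(torch.randn(1, 3, 128, 128)).shape)         # torch.Size([1, 1, 128, 128])
```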
Homer restored: Virtual reconstruction of Papyrus Bodmer 1
Simon Perrin, Léopold Cudilla, Yejing Xie, H. Mouchère, Isabelle Marthot-Santaniello
{"title":"Homer restored: Virtual reconstruction of Papyrus Bodmer 1","authors":"Simon Perrin, Léopold Cudilla, Yejing Xie, H. Mouchère, Isabelle Marthot-Santaniello","doi":"10.1145/3604951.3605518","DOIUrl":"https://doi.org/10.1145/3604951.3605518","url":null,"abstract":"In this paper, we propose a complete method to reconstruct a damaged piece of papyrus using its image annotated at the character level and the original ancient Greek text (known otherwise). Our reconstruction allows us to recreate the written surface, making it readable and consistent with the original one. Our method is in two stages. First, the text is reconstructed by pasting character patches in their possible locations. Second, we reconstruct the background of the papyrus by applying inpainting methods. Two different inpainting techniques are tested in this article, one traditional and one using a GAN. This global reconstruction method is applied on a piece of Papyrus Bodmer 1. The results are evaluated visually by the authors of the paper and by researchers in papyrology. This reconstruction allows historians to investigate new paths on the topic of writing culture and materiality while it significantly improves the ability of non specialists to picture what this papyrus, and ancient books in general, could have looked like in Antiquity.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134501182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
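The paper tests one traditional and one GAN-based inpainting technique for the papyrus background; the sketch below shows what such a traditional step could look like with OpenCV. The Telea algorithm, radius, and file names are assumptions for illustration, since the abstract does not name the classical method used.

```python
# Sketch of a classical inpainting step on the papyrus background; algorithm
# choice, radius, and file names are hypothetical.
import cv2
import numpy as np

def inpaint_background(papyrus_bgr: np.ndarray, hole_mask: np.ndarray) -> np.ndarray:
    """hole_mask: uint8 image, non-zero where the papyrus surface is missing."""
    return cv2.inpaint(papyrus_bgr, hole_mask, 5, cv2.INPAINT_TELEA)  # radius = 5 px

# Hypothetical usage: the mask would come from annotation of damaged regions.
# img = cv2.imread("bodmer1_fragment.png")
# mask = cv2.imread("bodmer1_holes.png", cv2.IMREAD_GRAYSCALE)
# cv2.imwrite("bodmer1_inpainted.png", inpaint_background(img, mask))
```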
NAME – A Rich XML Format for Named Entity and Relation Tagging
C. Clausner, S. Pletschacher, A. Antonacopoulos
{"title":"NAME – A Rich XML Format for Named Entity and Relation Tagging","authors":"C. Clausner, S. Pletschacher, A. Antonacopoulos","doi":"10.1145/3604951.3605521","DOIUrl":"https://doi.org/10.1145/3604951.3605521","url":null,"abstract":"We present NAME XML, a schema for named entities and relations in documents. The standout features are: option to reference a variety of document formats (such as PAGE XML or plain text), support of entity hierarchies, custom entity types via ontologies, more expressivity due to disambiguation of base entities and entity attributes (e.g. “person” and “person name”), and relations between entities that can be directed or undirected. We describe the format in detail, show examples, and discuss real-world use cases.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134394575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
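To make the idea of entity attributes and directed relations concrete, the snippet below builds a small illustrative record in Python. The element and attribute names are hypothetical stand-ins and do not reproduce the actual NAME XML schema.

```python
# Purely illustrative entity/relation record in the spirit of the features
# listed above; all element and attribute names are hypothetical.
import xml.etree.ElementTree as ET

root = ET.Element("Document", {"source": "page_0001.xml"})   # hypothetical reference to a PAGE XML file
entities = ET.SubElement(root, "Entities")
person = ET.SubElement(entities, "Entity", {"id": "e1", "type": "person"})
ET.SubElement(person, "Attribute", {"name": "person name", "value": "John Smith"})
place = ET.SubElement(entities, "Entity", {"id": "e2", "type": "place"})
ET.SubElement(place, "Attribute", {"name": "place name", "value": "Salford"})

relations = ET.SubElement(root, "Relations")
ET.SubElement(relations, "Relation",
              {"type": "born-in", "from": "e1", "to": "e2", "directed": "true"})

print(ET.tostring(root, encoding="unicode"))
```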
Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models
Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert
{"title":"Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models","authors":"Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert","doi":"10.1145/3604951.3605514","DOIUrl":"https://doi.org/10.1145/3604951.3605514","url":null,"abstract":"The Holocaust was a tragic and catastrophic event in World War II (WWII) history that resulted in the loss of millions of lives. In recent years, the emergence of the field of digital humanities has made the study of Holocaust testimonies an important area of research for historians, Holocaust educators, social scientists, and linguists. One of the challenges in analysing Holocaust testimonies is the recognition and categorisation of named entities such as concentration camps, military officers, ships, and ghettos, due to the scarcity of annotated data. This paper presents a research study on a domain-specific hybrid named-entity recognition model, which focuses on developing NER models specifically tailored for the Holocaust domain. To overcome the problem of data scarcity, we employed hybrid annotation approach to training different transformer model architectures in order to recognise the named entities. Results show transformer models to have good performance compared to other approaches.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122105646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
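The pseudo-labelling step mentioned in the title can be sketched as follows: a pretrained transformer NER model tags unlabelled testimony text, and only high-confidence predictions are kept as additional training data. The checkpoint name, example sentence, and confidence threshold are placeholders, not the paper's models or settings.

```python
# Sketch of pseudo-labelling with a pretrained transformer NER model; the
# checkpoint and threshold are illustrative placeholders.
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",        # placeholder generic English NER checkpoint
               aggregation_strategy="simple")

def pseudo_label(sentences, min_score=0.9):
    """Return (sentence, entity list) pairs whose predictions look confident."""
    labelled = []
    for sent in sentences:
        ents = [e for e in ner(sent) if e["score"] >= min_score]
        if ents:
            labelled.append((sent, [(e["word"], e["entity_group"]) for e in ents]))
    return labelled

unlabelled = ["We were deported from Warsaw to Auschwitz in 1943."]
for sent, ents in pseudo_label(unlabelled):
    print(sent, "->", ents)
```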
An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts
Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, A. Fornés
{"title":"An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts","authors":"Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, A. Fornés","doi":"10.1145/3604951.3605509","DOIUrl":"https://doi.org/10.1145/3604951.3605509","url":null,"abstract":"This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128040705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
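As generic background for this kind of HTR evaluation, the snippet below computes the character error rate (CER), a standard edit-distance-based metric; it is not presented as the paper's exact protocol or symbol set.

```python
# Character error rate (CER) via Levenshtein edit distance, a common HTR metric.
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    return levenshtein(prediction, reference) / max(len(reference), 1)

print(cer("ciphertezt", "ciphertext"))   # 0.1: one substitution over ten characters
```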