Layout Analysis on Challenging Historical Arabic Manuscripts using Siamese Network
Reem Alaasam, Berat Kurar, Jihad El-Sana
2019 International Conference on Document Analysis and Recognition (ICDAR), published 2019-09-01
DOI: 10.1109/ICDAR.2019.00123 (https://doi.org/10.1109/ICDAR.2019.00123)
Citations: 10
Abstract
This paper presents a layout analysis method for historical Arabic documents using a Siamese network. Given pages from different documents, we divide them into patches of similar size. We train a Siamese network model that takes a pair of patches as input and outputs a distance corresponding to the similarity between the two patches. We use the trained model to compute a distance matrix, which in turn is used to cluster the patches of a page into main text, side text, or background. We evaluate our method on a challenging dataset of historical Arabic manuscripts and report the F-measure. We demonstrate the effectiveness of our method by comparing it with other works that use deep learning approaches, and show that it achieves state-of-the-art results.
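
The distance-matrix and clustering steps described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_distance` is a hypothetical stand-in for the trained Siamese model (plain Euclidean distance on toy feature vectors), and k-medoids is swapped in as one clustering method that works directly on a distance matrix. The abstract does not say which clustering algorithm the authors use, nor how clusters are mapped to main text, side text, and background.

```python
import math
from typing import Callable, List, Sequence

Patch = Sequence[float]  # hypothetical: a patch reduced to a feature vector


def toy_distance(a: Patch, b: Patch) -> float:
    """Assumed stand-in for the trained Siamese model (Euclidean distance)."""
    return math.dist(a, b)


def distance_matrix(patches: Sequence[Patch],
                    dist: Callable[[Patch, Patch], float]) -> List[List[float]]:
    """Symmetric matrix of pairwise distances between all patches of a page."""
    n = len(patches)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(patches[i], patches[j])
    return d


def assign(d: List[List[float]], medoids: List[int]) -> List[int]:
    """Label each patch with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]


def k_medoids(d: List[List[float]], k: int = 3, max_iter: int = 50) -> List[int]:
    """Cluster patches into k groups using only the distance matrix."""
    # Farthest-first initialisation: deterministic, spreads the medoids out.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(len(d)),
                           key=lambda i: min(d[i][m] for m in medoids)))
    for _ in range(max_iter):
        labels = assign(d, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance
            # to the rest of its cluster (keep the old one if empty).
            updated.append(
                min(members, key=lambda m: sum(d[m][j] for j in members))
                if members else medoids[c])
        if updated == medoids:
            break
        medoids = updated
    return assign(d, medoids)


# Toy page: nine "patches" forming three well-separated groups.
patches = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.4),
           (10.0, 0.0), (10.3, 0.1), (9.8, 0.5),
           (0.0, 10.0), (0.2, 9.7), (0.4, 10.2)]
labels = k_medoids(distance_matrix(patches, toy_distance), k=3)
print(labels)
```

In a real pipeline, `toy_distance` would be replaced by a call to the trained network on a pair of image patches, and the resulting cluster indices would still need to be mapped to the three layout classes.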