{"title":"Generic method for grid line detection and removal in scanned documents","authors":"Romain Karpinski, A. Belaïd","doi":"10.1109/ASAR.2018.8480217","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480217","url":null,"abstract":"The detection and extraction of writing grid lines (WGL) in document images is an important task for a wide variety of systems. It is a pre-processing operation that tries to clean up the document image to make the recognition process easier. A lot of work has been proposed for staff line extraction in the context of Optical Music Recognition. Two competitions have been recently proposed in the 2011 and the 2013 ICDAR/GREC conferences. The method proposed in this paper aims to remove WGL without degrading the content. The whole method is based on the estimation of line_space (inter) and line_height and the use of run-length segments to locate WGL points. These points are then grouped together to form larger lines. Missing points are estimated by using a linear model and the context of other adjacent lines. We show that our method does not rely on the writing nature: printed or handwritten nor the language: musical symbols, Latin or Arabic writings. The results obtained are close to the state-of-the-art on not deformed documents. 
Furthermore, our method performs better than the ones that we have tested (at our disposal) on our image grid datasets.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127332151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MHDID: A Multi-distortion Historical Document Image Database","authors":"Atena Shahkolaei, Azeddine Beghdadi, S. Al-Maadeed, M. Cheriet","doi":"10.1109/ASAR.2018.8480372","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480372","url":null,"abstract":"In this paper, a new dataset, called Multi-distortion Historical Document Image Database (MHDID), to be used for the research on quality assessment of degraded documents and degradation classification is proposed. The MHDID dataset contains 335 historical document images which are classified into four categories based on their distortion types, namely, paper translucency, stain, readers’ annotations and worn holes. A total of 36 subjects participated to judge the quality of ancient document images. Pair comparison rating (PCR) is utilized as a subjective rating method for evaluating the visual quality of degraded document images. For each distortion image a mean opinion score (MOS) value is computed. This dataset could be used for evaluating the image quality assessment (IQA) measures as well as in the design of new metrics.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115388108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Term Weighting Scheme and an Approach for Classification of Agricultural Arabic Text Complaints","authors":"D. S. Guru, Mostafa Ali, M. Suhil","doi":"10.1109/ASAR.2018.8480317","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480317","url":null,"abstract":"In this paper, a machine learning based approach for classification of farmers’ complaints which are in Arabic text into different crops has been proposed. Initially, the complaints are preprocessed using stop word removal, auto correction of words, handling some special cases and stemming to extract only the content terms. Some of the domain specific special cases which may affect the classification performance are handled. A new term weighting scheme called Term Class Weight-Inverse Class Frequency (TCW-ICF) is then used to extract the most discriminating features with respect to each class. The extracted features are then used to represent the preprocessed complaints in the form of feature vectors for training a classifier. Finally, an unlabeled complaint is classified as a member of one of the crop classes by the trained classifier. Nevertheless, a relatively large dataset consisting of more than 5000 complaints of the farmers described in Arabic script from eight different crops has been created. The proposed approach has been experimentally validated by conducting an extensive experimentation on the newly created dataset using KNN classifier. It has been argued that the proposed outperforms the baseline Vector Space Model (VSM). Further, the superiority of the proposed term weighting scheme in selecting the best set of discriminating features has been demonstrated through a comparative analysis against four well-known feature selection techniques. 
The new term is applied on Arabic script as a case study but it can be applied on any text data from any language.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"197 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116393482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Smartphone Arabic Signboards Images Reading","authors":"S. Snoussi","doi":"10.1109/ASAR.2018.8480171","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480171","url":null,"abstract":"In this paper, we present the integration of preprocessing, segmentation and Arabic words recognition system. The obtained system is adapted to be executed by smartphone as an application to help pilgrims (HAJEEJ) from different nationalities to automatically read Arabic signboard images taken by their mobiles and recognize their location. The proposed system involves three main approaches i) an existing approach based on Mathematical Morphology (MM) preprocessing, ii) an Outer Isothetic Cover (OIC) segmentation approach and ii) a Transparent Neural Network (TNN) recognition approach. Note that the proposed system, is a smart one in the way it provides the adequate rules of the next pilgrimage step according to HAJEEJ current position. Hence for such smart system, it would be more fruitful to be suitable not only for desk/lab top machines but mainly for any mobile devices. The proposed system is applied on real database mobile images of specific HAJJ places to evaluate recognition rate, time and memory consuming which are necessary for mobile applications.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126895916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Collection and Image Processing System for Ancient Arabic Manuscripts","authors":"S. Al-Maadeed, Syed F. K. Peer, Nandhini Subramanian","doi":"10.1109/ASAR.2018.8480251","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480251","url":null,"abstract":"This paper presents a general-purpose data collection system that combines a DSLR camera with directional LED lamps in order to capture a large quantity of high-resolution manuscript images in such a way as to maximize the speed of data collection while minimizing time and the need for specialized equipment. By integrating custom image processing software, the captured document images are mapped to lie on a planar surface, thereby enabling the application of more sophisticated computer vision algorithms. For this purpose, we also introduce an optional binarization tool that allows researchers to perform basic image pre-processing to simplify later analysis. The hardware setup and software tools presented in this paper can be combined to yield a simple system capable of producing large image datasets for use in document analysis research projects.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134114379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Case Study: Fine Writing Style Classification Using Siamese Neural Network","authors":"Alaa Abdalhaleem, Berat Kurar Barakat, Jihad El-Sana","doi":"10.1109/ASAR.2018.8480212","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480212","url":null,"abstract":"This paper presents an automatic system for dividing a manuscript into similar parts, according to their similarity in writing style. This system is based on Siamese neural network, which consists of two identical sub-networks joined at their outputs. In the training the two sub-networks extract features from two patches, while the joining neuron measures the distance between the two feature vectors. Patches from the same page are considered as identical and patches from different books are considered as different. Based on that, the Siamese network computes the distances between patches of the same book.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"116 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125720482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Path Signature Approach to Online Arabic Handwriting Recognition","authors":"Daniel Wilson-Nunn, Terry Lyons, A. Papavasiliou, Hao Ni","doi":"10.1109/ASAR.2018.8480300","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480300","url":null,"abstract":"The Arabic script is one that has many properties that come together and result in what is commonly cited as one of the most beautiful scripts. Used by over 400 million people worldwide and with a history spanning over 1800 years, the Arabic script remains one of the most important languages in the world. Using tools from the theory of rough paths, combined with state of the art techniques from deep learning, we develop a recognition methodology for Arabic handwriting. Preliminary results using online Arabic handwritten characters show that the methodology developed can result in a significant decrease in error rate.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129696600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep FCN for Arabic Scene Text Detection","authors":"I. Beltaief, Mohamed Ben Halima","doi":"10.1109/ASAR.2018.8480394","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480394","url":null,"abstract":"Visual text is considered as one of the major indispensable aspects of communication field used by individuals and broadly applied in our daily Transactions. Thus, detecting and exploiting this textual information is of a big prominence. State of the art methods for detecting text on printed documents has achieved impressing results on both accuracy and precision values thanks to the sophisticated deep earning approaches, while researchers on natural scenes images still on progress due to the various difficulties on distinguishing text candidates from the remaining shapes. wherefore, as a fast and efficient solution, we propose a deep incorporated multilingual scene text detector system to forthwith localize text using an end-to-end trainable single Network. For training and testing stages, we have used the ACTIV [24] dataset.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123071867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How To Efficiently Increase Resolution in Neural OCR Models","authors":"Stephen Rawls, Huaigu Cao, Joe Mathai, P. Natarajan","doi":"10.1109/ASAR.2018.8480182","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480182","url":null,"abstract":"Modern CRNN OCR models require a fixed line height for all images, and it is known that, up to a point, increasing this input resolution improves recognition performance. However, doing so by simply increasing the line height of input images without changing the CRNN architecture has a large cost in memory and computation (they both scale O(n2) w.r.t. the input line height).We introduce a few very small convolutional and max pooling layers to a CRNN model to rapidly downsample high resolution images to a more manageable resolution before passing off to the \"base\" CRNN model. Doing this greatly improves recognition performance with a very modest increase in computation and memory requirements. We show a 33% relative improvement in WER, from 8.8% to 5.9% when increasing the input resolution from 30px line height to 240px line height on Open-HART/MADCAT Arabic handwriting data.This is a new state of the art result on Arabic handwriting, and the large improvement from an already strong baseline shows the impact of this technique.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130807743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASAR 2018 Competition Page Layout Analysis Using Fully Convolutional Networks","authors":"Ahmad Droby, Berat Kurar Barakat, Jihad El-Sana","doi":"10.1109/ASAR.2018.8480326","DOIUrl":"https://doi.org/10.1109/ASAR.2018.8480326","url":null,"abstract":"This technical report presents a Fully Convolutional Network based method for layout analysis of benchmarking dataset provided by the competition. The document image is segmented into text and non-text zones by dense pixel prediction. Convolutional part of the network can learn useful features from the document images and is robust to uncontrained layouts. We have evaluated the zone segmentation with average black pixel rate, over-segmentation error, under-segmentation error, correct-segmentation, missed-segmentation error, false alarm error, overall block error rate whereas the zone classification with precision, recall, F1-measure and average class accuracy on both pixel and block levels.","PeriodicalId":165564,"journal":{"name":"2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125076430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}