{"title":"Database of Handwritten Arabic Mathematical Formula Images","authors":"Ibtissem Hadj Ali, M. Mahjoub","doi":"10.1109/CGIV.2016.36","DOIUrl":null,"url":null,"abstract":"Although publicly available, ground-truthed database have proven useful for training, evaluating, and comparing recognition systems in many domains, the availability of such database for handwritten Arabic mathematical formula recognition in particular, is currently quite poor. In this paper, we present a new public database that contains off-line handwritten mathematical expressions. We describe in this paper the different steps to acquire this database, from the collection of the mathematical expression corpora to the transcription of the collected data. Actually, the database contains 4 238 off-line handwritten mathematical expressions written by 66 writers and 20 300 handwritten isolated symbol images. The ground truth is also presented for the handwritten expressions as XML files with the number of symbols, and the MATHML structure.","PeriodicalId":351561,"journal":{"name":"2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CGIV.2016.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
Although publicly available, ground-truthed databases have proven useful for training, evaluating, and comparing recognition systems in many domains, the availability of such databases for handwritten Arabic mathematical formula recognition remains quite limited. In this paper, we present a new public database of off-line handwritten mathematical expressions. We describe the steps taken to build it, from collecting the corpus of mathematical expressions to transcribing the collected data. The database currently contains 4,238 off-line handwritten mathematical expressions written by 66 writers, along with 20,300 images of isolated handwritten symbols. Ground truth for the handwritten expressions is provided as XML files that record the number of symbols and the MathML structure.
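As a rough illustration of how ground truth of this kind might be consumed, the sketch below parses one record and checks the declared symbol count against the MathML token elements. The abstract does not give the database's XML schema, so the `expression` element, its `id`, `writer`, and `symbolCount` attributes, and the sample record itself are all hypothetical; only the MathML namespace and the `mi`, `mn`, and `mo` token elements come from the MathML standard.

```python
import xml.etree.ElementTree as ET

# Standard MathML namespace; token elements carry identifiers, numbers, operators.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
TOKEN_TAGS = {f"{{{MATHML_NS}}}{name}" for name in ("mi", "mn", "mo")}

# HYPOTHETICAL record: the wrapper element and its attributes are assumptions,
# not the schema used by the database described in the abstract.
SAMPLE = """<expression id="expr_0001" writer="w01" symbolCount="5">
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <mrow><mi>س</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow>
  </math>
</expression>"""


def read_ground_truth(xml_text):
    """Return (declared symbol count, list of symbols found in the MathML)."""
    root = ET.fromstring(xml_text)
    declared = int(root.attrib["symbolCount"])  # assumed attribute name
    math = root.find(f"{{{MATHML_NS}}}math")
    # Walk the MathML tree in document order and keep only token elements.
    symbols = [el.text for el in math.iter() if el.tag in TOKEN_TAGS]
    return declared, symbols


declared, symbols = read_ground_truth(SAMPLE)
print(declared, symbols)
```

A consistency check such as `len(symbols) == declared` is one simple way to validate a record before using it for training or evaluation, assuming the real files expose both pieces of information in a comparable form.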