ALIF editor for generating Arabic normalized lexicons

Samia Ben Ismail, Hajer Maraoui, K. Haddar, Laurent Romary
DOI: 10.1109/IACS.2017.7921948 (https://doi.org/10.1109/IACS.2017.7921948)
Published in: 2017 8th International Conference on Information and Communication Systems (ICICS)
Publication date: 2017-04-01
Citations: 8

Abstract

Developing a normalized morpho-syntactic Arabic lexicon is not an easy task. Several standards exist for structuring and representing lexical data, and adopting a stable one guarantees that lexical resources are interoperable and interchangeable. Still, research on the normalization of Arabic lexical resources is not yet well developed, especially for standards such as the TEI (Text Encoding Initiative). In this context, we aim to create an Arabic lexicon editor with a constraint checker based on both the ISO standard LMF (Lexical Markup Framework) and the TEI guidelines. To develop this editor, we use a linguistic approach composed of several steps. The editor's prototype, named ALIF, produces two types of output lexicon files: one in LMF and the other in TEI. The system is evaluated on a lexical database that contains all the derived and inflected forms generated from a lexicon of 10 000 canonical verbs. The results obtained were encouraging despite some flaws related to exceptional cases of difficult words.
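To make the idea of one entry yielding two normalized outputs concrete, the following is a minimal sketch, not ALIF's actual implementation. It assumes a lexical entry held as a plain dictionary and serializes it once in an LMF-style layout (`LexicalEntry`, `Lemma`, `WordForm` with `feat` attribute/value pairs) and once in a TEI-style dictionary layout (`entry`, `form`, `orth`, `gramGrp`, `pos`). The function names, the dictionary keys, and the two rules in `check_entry` are hypothetical and chosen only for illustration; the abstract does not describe the editor's real constraints.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of accepted part-of-speech values (an assumption).
ALLOWED_POS = {"verb", "noun", "particle"}


def check_entry(entry):
    """Return a list of constraint violations (illustrative rules only)."""
    errors = []
    if not entry.get("lemma"):
        errors.append("missing lemma")
    if entry.get("pos") not in ALLOWED_POS:
        errors.append("unknown part of speech")
    return errors


def to_lmf(entry):
    """Build an LMF-style LexicalEntry element from the entry dict."""
    lexical_entry = ET.Element("LexicalEntry")
    ET.SubElement(lexical_entry, "feat", att="partOfSpeech", val=entry["pos"])
    lemma = ET.SubElement(lexical_entry, "Lemma")
    ET.SubElement(lemma, "feat", att="writtenForm", val=entry["lemma"])
    for form in entry.get("forms", []):
        word_form = ET.SubElement(lexical_entry, "WordForm")
        ET.SubElement(word_form, "feat", att="writtenForm", val=form)
    return lexical_entry


def to_tei(entry):
    """Build a TEI-style dictionary entry element from the same dict."""
    tei_entry = ET.Element("entry")
    lemma_form = ET.SubElement(tei_entry, "form", type="lemma")
    ET.SubElement(lemma_form, "orth").text = entry["lemma"]
    gram_grp = ET.SubElement(tei_entry, "gramGrp")
    ET.SubElement(gram_grp, "pos").text = entry["pos"]
    for form in entry.get("forms", []):
        inflected = ET.SubElement(tei_entry, "form", type="inflected")
        ET.SubElement(inflected, "orth").text = form
    return tei_entry


# Sample entry: a verb with two inflected forms (illustrative data).
sample = {"lemma": "كتب", "pos": "verb", "forms": ["كتب", "يكتب"]}

if not check_entry(sample):
    lmf_xml = ET.tostring(to_lmf(sample), encoding="unicode")
    tei_xml = ET.tostring(to_tei(sample), encoding="unicode")
```

Running the check before serialization mirrors the role of a constraint checker: an entry that fails validation never reaches either output file. A real editor would validate against the LMF DTD and a TEI schema rather than against hand-written rules like these.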