Efficacy of Arabic named-entity recognition

Suhad Al-Shoukry, N. Omar
Published in: 2015 International Conference on Electrical Engineering and Informatics (ICEEI)
Publication date: 2015-12-17
DOI: 10.1109/ICEEI.2015.7352553 (https://doi.org/10.1109/ICEEI.2015.7352553)
Citations: 0

Abstract

Named-entity recognition (NER) is a relatively new research field for Arabic, although it is mature for other languages. Because Arabic has more speech sounds than many other languages, Arabic writing styles lack uniformity: transcription can be ambiguous, the same word can be written in several different ways, and spelling mistakes arise from the same phenomenon. Arabic also has both long and short vowels, which adds further ambiguity. NER research in the Arabic world has typically been limited in capacity or coverage. With this in mind, we propose a method for analysing the structure of Arabic named-entity recognition and sentence object recognition by combining prior information with conditional random fields. The proposed method yields a 2.67% performance improvement per sentence compared with existing methods.
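
The abstract does not say which features are used or what form the prior information takes. As a minimal sketch only, assuming the prior is a gazetteer lookup and that spelling variants and short-vowel marks are folded before the lookup, per-token feature dictionaries for a linear-chain CRF might be built as below. Every name, gazetteer entry, and normalization rule here is hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set

# Assumed normalization table: fold common spelling variants to one form.
_VARIANTS = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0649": "\u064A",  # alef maqsura -> yeh
    "\u0629": "\u0647",  # teh marbuta -> heh
})
# Short-vowel and related marks, U+064B (fathatan) through U+0652 (sukun).
_DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def normalize(token: str) -> str:
    """Strip short-vowel marks, then fold spelling variants."""
    stripped = "".join(ch for ch in token if ch not in _DIACRITICS)
    return stripped.translate(_VARIANTS)


# Hypothetical gazetteers standing in for "prior information".
PERSON_GAZETTEER: Set[str] = {normalize(w) for w in (
    "\u0623\u062D\u0645\u062F",  # Ahmad
    "\u0645\u062D\u0645\u062F",  # Muhammad
)}
LOCATION_GAZETTEER: Set[str] = {normalize(w) for w in (
    "\u0628\u063A\u062F\u0627\u062F",  # Baghdad
)}


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Features for token i: surface cues, gazetteer flags, and neighbours."""
    word = normalize(tokens[i])
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word": word,
        "prefix2": word[:2],
        "suffix2": word[-2:],
        "in_person_gazetteer": word in PERSON_GAZETTEER,
        "in_location_gazetteer": word in LOCATION_GAZETTEER,
    }
    if i > 0:
        feats["prev_word"] = normalize(tokens[i - 1])
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_word"] = normalize(tokens[i + 1])
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """One feature dictionary per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Folding variants before the gazetteer lookup means that a name written with or without hamza, or with or without short-vowel marks, hits the same entry. This is one plausible way to address the spelling ambiguity the abstract describes. A CRF trainer would then learn weights over these features together with label-transition weights.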