An hybrid approach to improve part of speech tagging system

S. Farrah, Hanane El Manssouri, E. Ziyati, M. Ouzzif
{"title":"An hybrid approach to improve part of speech tagging system","authors":"S. Farrah, Hanane El Manssouri, E. Ziyati, M. Ouzzif","doi":"10.1109/ISACV.2018.8354032","DOIUrl":null,"url":null,"abstract":"Platforms interacting with data in text format, such as social networks or search engines, face major challenges regarding this flow of texts such as storage, search and information processing. New disciplines have emerged as natural language processing that involve identifying all aspects of language (spoken or written). In this perspective, we focus on the aspect of part-of speech (POS) tagging applied to the Arabic language which consists in marking each word in the text with its good tag. One of the most difficult problems affecting POS tagging is the ambiguity of the text. Ambiguity is the most important problem in the natural language processing. We propose a rule-based hybrid approach with an artificial neural network classifier to determine the appropriate tags of an Arabic text. The first phase consists of extracting all the affixes to identify the nature of the word and its tags according to grammatical rules, the second phase begins by transliterating the Arabic text into text with Roman letters. The transliterated text is then transformed into digital vectors to form the input of the classifier based on the neural networks. The two phases are combined to identify the tag of each word.","PeriodicalId":184662,"journal":{"name":"2018 International Conference on Intelligent Systems and Computer Vision (ISCV)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Conference on Intelligent Systems and Computer Vision (ISCV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISACV.2018.8354032","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Platforms interacting with data in text format, such as social networks or search engines, face major challenges regarding this flow of texts such as storage, search and information processing. New disciplines have emerged as natural language processing that involve identifying all aspects of language (spoken or written). In this perspective, we focus on the aspect of part-of speech (POS) tagging applied to the Arabic language which consists in marking each word in the text with its good tag. One of the most difficult problems affecting POS tagging is the ambiguity of the text. Ambiguity is the most important problem in the natural language processing. We propose a rule-based hybrid approach with an artificial neural network classifier to determine the appropriate tags of an Arabic text. The first phase consists of extracting all the affixes to identify the nature of the word and its tags according to grammatical rules, the second phase begins by transliterating the Arabic text into text with Roman letters. The transliterated text is then transformed into digital vectors to form the input of the classifier based on the neural networks. The two phases are combined to identify the tag of each word.
一种改进词性标注系统的混合方法
与文本格式的数据交互的平台,如社交网络或搜索引擎,面临着关于文本流的主要挑战,如存储、搜索和信息处理。新的学科如自然语言处理已经出现,涉及识别语言的各个方面(口语或书面语)。从这个角度来看,我们关注的是词性标注(POS)在阿拉伯语中的应用,即在文本中为每个单词标记好词性标注。影响词性标注的最困难的问题之一是文本的歧义。歧义是自然语言处理中的一个重要问题。我们提出了一种基于规则的混合方法与人工神经网络分类器来确定阿拉伯语文本的适当标签。第一阶段是根据语法规则提取词缀来识别单词的性质及其标签,第二阶段是将阿拉伯语文本音译为罗马字母文本。然后将音译后的文本转换成数字向量,形成基于神经网络的分类器的输入。这两个阶段相结合,以确定每个词的标签。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信