{"title":"Rules-based grammatical and semantic disambiguation of the token \"hattā\" in Arabic","authors":"Dhaou Ghoul, A. Ibrahim, C. Audebert","doi":"10.1109/ICTA.2015.7426889","DOIUrl":null,"url":null,"abstract":"In this paper, we present a method of grammatical and semantic disambiguation of the particle or the token \"hatta̅\" in Arabic language. This method is based on a thorough analysis of the context. Our goal is to achieve the maximum linguistic information of this token thanks to a corpus in order to modeling as a grammar or rules. To do this, we first developed a corpus that contains the different contexts of the token \"hatta̅\". Second from this corpus, we identified the different linguistic criteria of this token that allow us to correctly identify it. Finally, we codified this information in the form of linguistic rules in order to detect it easily by machine.","PeriodicalId":375443,"journal":{"name":"2015 5th International Conference on Information & Communication Technology and Accessibility (ICTA)","volume":"45 1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 5th International Conference on Information & Communication Technology and Accessibility (ICTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTA.2015.7426889","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
In this paper, we present a method for the grammatical and semantic disambiguation of the particle, or token, "hattā" in the Arabic language. The method is based on a thorough analysis of context. Our goal is to extract the maximum linguistic information about this token from a corpus in order to model it as a grammar or a set of rules. To do this, we first built a corpus containing the different contexts in which the token "hattā" occurs. Second, from this corpus, we identified the linguistic criteria that allow the token to be correctly identified. Finally, we codified this information as linguistic rules so that the token can be detected easily by machine.
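To illustrate the general idea of codifying contextual criteria as machine-readable rules, the sketch below shows one possible way such rules could be expressed and applied. The specific patterns and labels are hypothetical placeholders for illustration only; they are not the criteria identified in the paper's corpus study.

```python
import re

# Illustrative sketch only: each rule pairs a (hypothetical) regular expression
# over the local context of "hatta" with the grammatical/semantic label it
# would assign. Real rules would be derived from the corpus analysis.
RULES = [
    # Hypothetical cue: "hatta" followed by a verb-like form -> purpose reading ("so that").
    (re.compile(r"\bhatta\s+(?:yaf|taf)\w+"), "particle_of_purpose"),
    # Hypothetical cue: "hatta" followed by a definite noun phrase -> preposition "until".
    (re.compile(r"\bhatta\s+al-\w+"), "preposition_until"),
]

DEFAULT_LABEL = "inceptive_particle"  # fallback when no contextual rule fires


def disambiguate(context: str) -> str:
    """Return a label for the occurrence of 'hatta' in the given context string."""
    for pattern, label in RULES:
        if pattern.search(context):
            return label
    return DEFAULT_LABEL


if __name__ == "__main__":
    print(disambiguate("hatta al-masa"))   # -> preposition_until
    print(disambiguate("hatta yafhama"))   # -> particle_of_purpose
```

The design point is simply that once the contextual criteria are written down as rules, disambiguation reduces to matching the local context of each occurrence against an ordered rule list.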