{"title":"Arabic Words Stemming Approach Using Arabic Wordnet","authors":"Abdel Hamid Kreaa, Ahmad S Ahmad, Kassem Kabalan","doi":"10.5121/IJDKP.2014.4601","DOIUrl":null,"url":null,"abstract":"The big growth of the Arabic internet content in the last years has raised up the need for an effective stemming techniques for Arabic language. Arabic stemming algorithms can be ranked, according to three category, as root-based approach (ex. Khoja); stem-based approach (ex. Larkey); and statistical approach (ex. N-Garm). However, no stemming of this language is perfect: The existing stemmers have a low efficiency. In this paper, we introduce a new stemming technique for Arabic words that also solve the problem of the plural form of irregular nouns in Arabic language, which called broken plural. The proposed stem extractor provides very accurate results in comparisons with other algorithms. Consequently the search effectiveness improved.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"4 5","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Data Mining & Knowledge Management Process","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5121/IJDKP.2014.4601","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 13
Abstract
The rapid growth of Arabic internet content in recent years has raised the need for effective stemming techniques for the Arabic language. Arabic stemming algorithms fall into three categories: root-based approaches (e.g., Khoja), stem-based approaches (e.g., Larkey), and statistical approaches (e.g., N-gram). However, no stemmer for this language is perfect: existing stemmers have low efficiency. In this paper, we introduce a new stemming technique for Arabic words that also solves the problem of the irregular plural form of nouns in Arabic, known as the broken plural. The proposed stem extractor provides very accurate results in comparison with other algorithms, and search effectiveness improves as a consequence.
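To illustrate why broken plurals are hard for affix-stripping stemmers, here is a minimal Python sketch. It is not the authors' algorithm. The affix lists are deliberately tiny, and `BROKEN_PLURALS` is a hypothetical hand-written table standing in for a lookup against a lexical resource such as Arabic WordNet. The function names `light_stem` and `stem` are likewise assumptions made for this example.

```python
# Hypothetical lookup table: broken plural -> singular.
# A real system would query a lexical resource instead of a hard-coded dict.
BROKEN_PLURALS = {
    "كتب": "كتاب",   # books -> book
    "رجال": "رجل",   # men -> man
    "أقلام": "قلم",  # pens -> pen
}

PREFIXES = ("ال",)              # definite article only (assumed, incomplete)
SUFFIXES = ("ات", "ون", "ين")   # regular (sound) plural endings (assumed, incomplete)

MIN_STEM_LEN = 3  # never strip an affix if fewer than 3 letters would remain


def light_stem(word: str) -> str:
    """Strip at most one known prefix and one known suffix."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


def stem(word: str) -> str:
    """Light stemming followed by a broken-plural lookup."""
    candidate = light_stem(word)
    return BROKEN_PLURALS.get(candidate, candidate)


if __name__ == "__main__":
    for w in ("المعلمون", "الكتب", "رجال"):
        print(w, light_stem(w), stem(w))
```

Affix stripping alone reduces the regular plural "المعلمون" to "معلم", but it leaves the broken plural "كتب" unchanged because the plural is formed by altering the word's internal pattern rather than by adding a suffix. Only the lookup step maps it to the singular "كتاب".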