Integrating Bilingual Named Entities Lexicon with Conditional Random Fields Model for Arabic Named Entities Recognition
Emna Hkiri, Souheyl Mallat, M. Zrigui
2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017-11-01
DOI: 10.1109/ICDAR.2017.105 (https://doi.org/10.1109/ICDAR.2017.105)
Citations: 5
Abstract
Named Entity Recognition (NER) locates atomic elements in text and classifies them into predefined categories such as person names, locations, organizations, and temporal expressions. Several rule-based and machine-learning-based approaches have been applied successfully to English and some other Latin languages. Arabic has a complex and rich morphology, which makes named entity recognition a challenging process. In this paper we propose a hybrid NER system that applies conditional random fields (CRF), a bilingual NE lexicon, and grammar rules to the task of Named Entity Recognition in Arabic. The aim of the system is to enhance the overall performance of NER tasks. The empirical results indicate that the hybrid system outperforms the state of the art in Arabic NER in terms of precision when applied to the ANERcorp dataset, with F-measures of 83.36 for Person, 89.58 for Location, and 72.26 for Organization.
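
The abstract does not say how the lexicon is combined with the CRF. One common approach, shown here only as a minimal sketch and not as the authors' method, is to expose lexicon membership as a per-token feature alongside surface features. The lexicon entries, feature names, and the functions `token_features` and `sentence_features` are all hypothetical and chosen for illustration.

```python
from typing import Dict, List

# Hypothetical toy lexicon mapping Arabic surface forms to entity types.
# These entries are illustrative and are NOT taken from the paper's lexicon.
TOY_LEXICON: Dict[str, str] = {
    "تونس": "LOC",  # Tunisia
    "محمد": "PER",  # Mohamed
}


def token_features(tokens: List[str], i: int, lexicon: Dict[str, str]) -> Dict[str, str]:
    """Build a feature dict for token i, including a lexicon-lookup feature."""
    tok = tokens[i]
    return {
        "word": tok,
        "prefix2": tok[:2],   # crude stand-in for morphological prefix cues
        "suffix2": tok[-2:],  # crude stand-in for morphological suffix cues
        "lexicon_type": lexicon.get(tok, "NONE"),  # assumed gazetteer feature
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens: List[str], lexicon: Dict[str, str]) -> List[Dict[str, str]]:
    """Feature dicts for every token, in the shape most CRF toolkits accept."""
    return [token_features(tokens, i, lexicon) for i in range(len(tokens))]


if __name__ == "__main__":
    sentence = ["زار", "محمد", "تونس"]  # "Mohamed visited Tunisia"
    for feats in sentence_features(sentence, TOY_LEXICON):
        print(feats["word"], feats["lexicon_type"])
```

A list of feature dicts per sentence like this could then be passed to a CRF trainer. Grammar rules, which the abstract also mentions, would be a separate step and are not modeled here.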