Integrating Bilingual Named Entities Lexicon with Conditional Random Fields Model for Arabic Named Entities Recognition
Authors: Emna Hkiri, Souheyl Mallat, M. Zrigui
Published in: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017-11-01
DOI: https://doi.org/10.1109/ICDAR.2017.105
Named Entity Recognition (NER) plays an important role in locating and classifying atomic elements into predefined categories such as person names, locations, organizations, and expressions of time. Several rule-based and machine-learning-based approaches have been applied successfully to English and some other Latin languages. Arabic has a complex and rich morphology, which makes named entity recognition a challenging process. In this paper we propose a hybrid NER system that applies conditional random fields (CRF), a bilingual NE lexicon, and grammar rules to the task of named entity recognition in Arabic. The aim of the system is to enhance the overall performance of NER tasks. The empirical results indicate that the hybrid system outperforms the state of the art in Arabic NER in terms of precision when applied to the ANERcorp dataset, with F-measures of 83.36 for Person, 89.58 for Location, and 72.26 for Organization.
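The abstract does not describe how the lexicon and grammar rules are combined with the CRF. One common way to build such a hybrid is to expose lexicon lookups and rule matches as per-token features for the CRF, and the following is a minimal sketch of that idea only, not the authors' implementation. The lexicon entries, the trigger-word list, and the names `BILINGUAL_LEXICON`, `PERSON_TRIGGERS`, `token_features`, and `sentence_features` are all hypothetical.

```python
# Illustrative sketch: lexicon and rule evidence encoded as CRF token features.
# All data and names below are assumptions, not taken from the paper.

# Hypothetical bilingual lexicon: Arabic form -> (English gloss, entity type).
BILINGUAL_LEXICON = {
    "القاهرة": ("Cairo", "LOC"),
    "مصر": ("Egypt", "LOC"),
    "محمد": ("Mohamed", "PER"),
}

# Hypothetical trigger words standing in for a simple grammar rule:
# a token that follows one of these titles is likely a person name.
PERSON_TRIGGERS = {"الرئيس", "السيد", "الدكتور"}


def token_features(tokens, i):
    """Build the feature dictionary for the token at position i."""
    tok = tokens[i]
    prev_tok = tokens[i - 1] if i > 0 else "<BOS>"
    next_tok = tokens[i + 1] if i < len(tokens) - 1 else "<EOS>"
    entry = BILINGUAL_LEXICON.get(tok)
    return {
        "token": tok,
        # Short affixes give the model a rough view of Arabic morphology.
        "prefix2": tok[:2],
        "suffix2": tok[-2:],
        # Lexicon evidence.
        "in_lexicon": entry is not None,
        "lexicon_type": entry[1] if entry else "NONE",
        # Context and rule evidence.
        "prev_token": prev_tok,
        "next_token": next_tok,
        "after_person_trigger": prev_tok in PERSON_TRIGGERS,
    }


def sentence_features(tokens):
    """Return one feature dictionary per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Roughly: "The president Mohamed visited Cairo" (verb-first word order).
    sentence = ["زار", "الرئيس", "محمد", "القاهرة"]
    for feats in sentence_features(sentence):
        print(feats["token"], feats["lexicon_type"], feats["after_person_trigger"])
```

A list of per-token feature dictionaries like this is a shape that CRF toolkits commonly accept as training input, paired with one label per token. Encoding the lexicon as a feature rather than a hard override lets the model learn how much to trust it for each entity type.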