NAMED ENTITY RECOGNITION AND INFORMATION EXTRACTION FOR ARABIC MEDICAL TEXT
Jaafar Hammoud, N. Dobrenko, N. Gusarova
Proceedings of the 12th International Conference on e-Health (EH2020), 2020-07-21
https://doi.org/10.33965/eh2020_202009l015
This article discusses approaches to the NER (Named Entity Recognition) problem for medical texts in Arabic when labeled datasets, computational resources, and specialized linguistic resources are all in limited supply. To overcome these limitations, we propose using recurrent neural networks. In our experiments, we used "BERT-Base, Multilingual Cased" from Google and a Pooled-GRU with the Multi-lingual Universal Sentence Encoder (MUSE) from Facebook. Each network was fine-tuned on our dataset, which was obtained from three medical volumes issued by the Arabic Encyclopedia. We experimentally evaluated the fine-tuned models on a real NLP (Natural Language Processing) task, recognizing medical entities in the Arabic Medical Encyclopedia, and obtained encouraging results.
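
The abstract does not state how the tagged output was scored, so the following is only a minimal sketch of one common way to evaluate an NER model: convert token-level tags to entity spans and compute an exact-match, entity-level F1. The BIO tagging scheme, the label names `DISEASE` and `DRUG`, and the helper names `bio_to_spans` and `entity_f1` are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Optional, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme).

    An I- tag that does not continue an open span of the same type is ignored.
    """
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((label, start, i))  # close the currently open span
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]  # open a new span
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1 between gold and predicted tag sequences."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical four-token sentence: the prediction misses the DRUG entity.
gold_tags = ["B-DISEASE", "I-DISEASE", "O", "B-DRUG"]
pred_tags = ["B-DISEASE", "I-DISEASE", "O", "O"]
score = entity_f1(gold_tags, pred_tags)
```

In this example the prediction recovers one of the two gold entities with no false positives, so precision is 1, recall is 0.5, and the F1 score is 2/3. Scoring whole spans rather than individual tokens penalizes partially recognized multi-word terms, which matters for medical vocabulary where entities often span several tokens.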