Mohammed Rais, Abdelmonaime Lachkar, Abdelhamid Lachkar, S. A. Ouatik
A comparative study of biomedical named entity recognition methods based machine learning approach
2014 Third IEEE International Colloquium in Information Science and Technology (CIST), 2014-10-01
DOI: 10.1109/CIST.2014.7016641 (https://doi.org/10.1109/CIST.2014.7016641)
Citations: 8
Abstract
Recognizing Biomedical Named Entities (BioNEs) such as genes, proteins, cells, drugs, and diseases plays a vital role in many Biomedical Text Mining applications. Biomedical Named Entity Recognition (BioNER) methods fall into five approaches: Dictionary-Based, Rule-Based, Machine-Learning-Based, Statistical-Based, and Hybrid-Based. Machine-Learning-Based methods are more effective than those of the other approaches and have therefore been widely used for learning to recognize BioNEs. In this paper, we present a comparative theoretical and experimental study of seven Machine Learning methods, summarizing their advantages and weaknesses and comparing their performance on two standard biomedical corpora (GENIA and JNLPBA). The results show that the Conditional Random Field (CRF) method outperforms all the other Machine Learning methods on both corpora. CRF will be integrated into our future work.