Event extraction from biomedical text using CRF and genetic algorithm
A. Majumder, Asif Ekbal
Proceedings of the 2015 Third International Conference on Computer, Communication, Control and Information Technology (C3IT)
Published: 2015-03-16
DOI: 10.1109/C3IT.2015.7060131 (https://doi.org/10.1109/C3IT.2015.7060131)
Citations: 4
Abstract
The main aim of biomedical information extraction is to capture biomedical phenomena from textual data by extracting relevant entities, information and relations between biomedical entities (i.e. proteins and genes). In the recent past the focus has shifted towards the extraction of more complex relations in the form of bio-molecular events, which may include several entities or other relations. In this paper we propose a supervised machine learning approach based on a Conditional Random Field (CRF) combined with a Genetic Algorithm (GA) to detect events, classify them into predefined categories of interest, and determine the arguments of the events. We implement a set of statistical and linguistic features that represent morphological, syntactic and contextual information about the bio-molecular trigger words. Experiments using 5-fold cross validation yield recall, precision and F-measure values of 36.52%, 59.72% and 45.33%, respectively.
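As a quick consistency check on the reported scores, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the balanced F-measure (the harmonic mean of precision and recall, often written F1); the abstract does not say which variant was used, and the helper name `f_measure` is purely illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: equal weighting of precision and recall (beta = 1);
    the abstract does not specify the variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract, in percent.
reported_recall = 36.52
reported_precision = 59.72

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure from reported precision and recall: {f1:.2f}%")
```

Under this assumption the result comes out at roughly 45.3%, in line with the reported 45.33%. A difference in the last digit could come from rounding or from averaging the scores over the five folds.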