Title: An effective extension to Okapi for biomedical text mining
Authors: Ming Zhong, Xiangji Huang
Venue: 2006 IEEE International Conference on Granular Computing
Published: 2006-05-10
DOI: 10.1109/GRC.2006.1635878 (https://doi.org/10.1109/GRC.2006.1635878)
Citations: 1
Abstract
In the biomedical text mining domain, a challenging problem is identifying biological entities that go by multiple name forms. Because of this, traditional IR systems usually do not perform well. We propose an extension to the Okapi information retrieval system that enables it to identify biological entities with multiple lexical variants. The extension integrates the Okapi system, an automatic query expansion algorithm, and a new method for transforming a topic written in natural language into a structured query. Experiments on both the 2004 and 2005 TREC Genomics data sets show that the proposed extension to Okapi is effective and competitive.
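
To make the idea of a structured query over lexical variants concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the variant rules or the query syntax. The function names (`lexical_variants`, `structured_query`, `to_boolean_string`), the separator-based variant rules, and the Boolean output format are all assumptions made for illustration, and the tokenizer handles ASCII letters and digits only.

```python
import re


def lexical_variants(term):
    """Return simple spelling variants of a term.

    Assumed rule (hypothetical, not from the paper): split the term into
    runs of letters and runs of digits, then rejoin the runs with no
    separator, a space, or a hyphen.
    """
    lowered = term.lower()
    parts = re.findall(r"[a-z]+|[0-9]+", lowered)
    candidates = [lowered] + [sep.join(parts) for sep in ("", " ", "-")]
    # Drop empty strings and duplicates while keeping first-seen order.
    return list(dict.fromkeys(c for c in candidates if c))


def structured_query(topic_terms):
    """Map each topic term to its group of variants (one group per term)."""
    return [lexical_variants(term) for term in topic_terms]


def to_boolean_string(query):
    """Render the groups as OR-lists joined by AND (illustrative syntax only)."""
    groups = ["(" + " OR ".join(f'"{v}"' for v in group) + ")" for group in query]
    return " AND ".join(groups)


if __name__ == "__main__":
    print(to_boolean_string(structured_query(["NF-kB", "p53"])))
```

Each inner list groups the variants of one entity, so a retrieval engine could treat a match on any variant as a match on the entity. A real system would likely draw variants from a curated synonym resource rather than from separator rules alone.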