A comparative study of text classification approaches for personalized retrieval in PubMed
Sachintha Pitigala, Cen Li, S. Seo
2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW), pp. 919-921. Published 2011-11-12.
DOI: https://doi.org/10.1109/BIBMW.2011.6112503
Retrieving information relevant to one's needs from PubMed is becoming increasingly challenging because of the database's large volume and rapid growth. Traditional search techniques based on keyword matching are insufficient for databases as large as PubMed. A personalized article retrieval system that is tailored to an individual researcher's specific interests and selects only highly relevant articles can be a helpful tool in bioinformatics. Text classification methods developed in the text mining community have shown good results in differentiating relevant articles from irrelevant ones. This study compares two text classification methods, Naïve Bayes and Support Vector Machines, to assess how effectively each classifies full-text articles when only a small set of training data is available. The comparison results show that Naïve Bayes is a better choice than Support Vector Machines for building a personalized article retrieval system that can learn (train) from a small set of full-text articles.
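To make the classification step concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, trained on a handful of labeled documents. It is not the authors' implementation: the abstract does not describe their features, preprocessing, or software, so the whitespace tokenizer, the class name `NaiveBayesText`, the labels, and the four toy training strings are all assumptions made for illustration. The Support Vector Machine side of the comparison is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)          # documents per class
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_docs = len(labels)
        return self

    def log_scores(self, doc):
        """Return log P(class) + sum of log P(word | class) for each class."""
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / self.total_docs)
            for word in tokenize(doc):
                if word not in self.vocab:
                    continue  # skip words never seen during training
                smoothed = (self.word_counts[c][word] + 1) / (total_words + vocab_size)
                score += math.log(smoothed)
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical toy training set standing in for a researcher's labeled articles.
train_docs = [
    "gene expression microarray clustering analysis",
    "protein sequence alignment gene annotation",
    "clinical trial patient blood pressure outcome",
    "patient surgery recovery hospital outcome",
]
train_labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

model = NaiveBayesText().fit(train_docs, train_labels)
prediction_a = model.predict("gene clustering of protein sequence data")
prediction_b = model.predict("patient outcome after surgery")
print(prediction_a, prediction_b)
```

Because the model only needs per-class word counts, it can be fitted from very few documents, which is the setting the study targets. With real full-text articles the vocabulary would be far larger, and stop-word removal or feature selection would likely be needed.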