BioTextRetriever: A Tool to Retrieve Relevant Papers

Int. J. Knowl. Discov. Bioinform. Pub Date: 2011-07-01. DOI: 10.4018/jkdb.2011070102

Célia Talma Gonçalves, Rui Camacho, Eugénio C. Oliveira


Citations: 10

Abstract



Whenever new DNA or protein sequences are decoded, it is almost compulsory to look at similar sequences and the papers describing them, both to collect information about the function and activity of the new sequences and to find out what is already known about similar ones. Current web sites and sequence databases usually link a set of curated paper references to each sequence. Those links are a good starting point when looking for information related to a set of sequences. One way to implement this approach is to run a BLAST search with the newly decoded sequences, collect similar sequences, and then look at the papers linked to them. Most often the number of retrieved papers is small, and one has to search large databases for further relevant papers. This paper proposes a process for generating a classifier from the initial set of relevant papers. First, the authors collect similar sequences using an alignment algorithm such as BLAST. Then, they use the enlarged set of papers to construct a classifier. Finally, the automatically constructed classifier is used to search MEDLINE and enlarge the set of relevant papers.
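The last two steps of the process (build a classifier from the papers linked to similar sequences, then use it to filter further candidates) can be illustrated with a toy sketch. The abstract does not say which classifier, which text features, or which negative examples the authors use, so everything below is an assumption made for illustration: a small multinomial naive Bayes over word counts, a hand-written `background` list standing in for non-relevant papers, and in-memory strings standing in for both the BLAST-linked papers and the MEDLINE candidates. The names `NaiveBayesRelevance`, `seed_papers`, `background`, and `candidates` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesRelevance:
    """Two-class multinomial naive Bayes with add-one smoothing.

    Illustrative stand-in only; the source does not name its classifier.
    """

    def fit(self, relevant_docs, other_docs):
        self.counts = {
            "relevant": Counter(t for d in relevant_docs for t in tokenize(d)),
            "other": Counter(t for d in other_docs for t in tokenize(d)),
        }
        self.vocab = set(self.counts["relevant"]) | set(self.counts["other"])
        total_docs = len(relevant_docs) + len(other_docs)
        self.log_prior = {
            "relevant": math.log(len(relevant_docs) / total_docs),
            "other": math.log(len(other_docs) / total_docs),
        }
        return self

    def _log_score(self, label, tokens):
        counts = self.counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        score = self.log_prior[label]
        for tok in tokens:
            if tok in self.vocab:  # skip words never seen during training
                score += math.log((counts[tok] + 1) / denom)
        return score

    def is_relevant(self, text):
        tokens = tokenize(text)
        return self._log_score("relevant", tokens) > self._log_score("other", tokens)


# Hypothetical stand-ins for papers linked to the BLAST hits (positives).
seed_papers = [
    "Kinase domain structure and ATP binding in protein kinase family",
    "Phosphorylation activity of a serine threonine kinase",
]
# Hypothetical non-relevant papers (negatives); their origin is an assumption.
background = [
    "Survey of hospital nursing staff workload",
    "Clinical trial of a dietary intervention in adults",
]
# Hypothetical stand-ins for records returned by a MEDLINE search.
candidates = [
    "A novel kinase with ATP binding activity",
    "Workload survey among hospital staff",
]

classifier = NaiveBayesRelevance().fit(seed_papers, background)
enlarged = [c for c in candidates if classifier.is_relevant(c)]
print(enlarged)
```

In a real setting the three lists would come from the sequence database links and from MEDLINE queries, and the classifier would be trained on full abstracts rather than titles; the sketch only shows how a classifier trained on the seed set can enlarge it.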

