BioTextRetriever: A Tool to Retrieve Relevant Papers

Célia Talma Gonçalves, Rui Camacho, Eugénio C. Oliveira
Journal: Int. J. Knowl. Discov. Bioinform.
Publication date: 2011-07-01
Publication type: Journal Article
DOI: 10.4018/jkdb.2011070102 (https://doi.org/10.4018/jkdb.2011070102)
Citations: 10

Abstract

Whenever new DNA or protein sequences are decoded, it is almost compulsory to examine similar sequences and the papers describing them, both to collect information about the function and activity of the new sequences and to learn what is already known about similar ones. Current websites and sequence databases usually link each sequence to a set of curated paper references. Those links are a good starting point when looking for information relevant to a set of sequences. One way to implement this approach is to run BLAST on the newly decoded sequences, collect the similar sequences, and then look at the papers linked to them. Most often the number of papers retrieved this way is small, and one has to search large databases for further relevant papers. This paper proposes a process for generating a classifier from the initial set of relevant papers. First, the authors collect similar sequences using an alignment algorithm such as BLAST. Then, they use the enlarged set of papers to construct a classifier. Finally, the automatically constructed classifier is used to search MEDLINE and enlarge the set of relevant papers.
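
The last step of the process described above, using a classifier built from the initial paper set to decide which further papers are relevant, could look roughly like the sketch below. The abstract does not name the learning algorithm, the features, or the source of negative examples, so the multinomial Naive Bayes model, the bag-of-words tokenizer, and the toy abstracts are all assumptions made for illustration. Collecting sequences with BLAST and querying MEDLINE are outside the sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed model choice)."""

    def fit(self, docs, labels):
        self.doc_counts = Counter(labels)   # number of training documents per label
        self.word_counts = {}               # label -> Counter of token frequencies
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, doc):
        """Return the unnormalized log posterior of each label for one document."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        tokens = [t for t in tokenize(doc) if t in self.vocab]  # skip unseen words
        scores = {}
        for label, counts in self.word_counts.items():
            n_tokens = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (n_tokens + vocab_size))
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical training data: abstracts of papers linked to the similar
# sequences ("relevant") and unrelated abstracts used as negatives.
relevant = [
    "kinase phosphorylation of protein substrate in signal transduction",
    "protein kinase domain structure and catalytic activity",
]
irrelevant = [
    "survey of hospital staffing and patient satisfaction",
    "economic cost of outpatient care in rural clinics",
]
docs = relevant + irrelevant
labels = ["relevant"] * len(relevant) + ["irrelevant"] * len(irrelevant)
clf = NaiveBayes().fit(docs, labels)

# Hypothetical candidate abstracts, standing in for MEDLINE search results.
candidates = [
    "catalytic activity of a novel protein kinase",
    "patient satisfaction in rural hospital care",
]
for abstract in candidates:
    print(clf.predict(abstract), "|", abstract)
```

In this toy run the first candidate is labelled relevant and the second irrelevant. With real data, the training set would come from the curated references of the BLAST hits, and the candidates from a MEDLINE query.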