A framework for decision tree-based method to index data from large protein sequence databases

K. Jaber, R. Abdullah, N. Rashid
2010 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES), published 2010-11-01
DOI: 10.1109/IECBES.2010.5742212 (https://doi.org/10.1109/IECBES.2010.5742212)
Citation count: 3

Abstract

The size of biological databases has increased significantly, along with the number of users and the rate of queries; some databases are now of terabyte size. Hence, there is an increasing need to access these databases as quickly as possible. For biologists, the need is for fast, scalable and accurate searching of biological databases. This may seem a simple task given the speed of current processors. However, this is far from the truth, as the amount of data deposited into these databases is ever increasing. Hence, searching a database becomes a difficult and time-consuming task. Here, computer scientists can help by organizing data in a way that allows biologists to quickly search existing information and to predict new entries. In this paper, a decision tree indexing method is presented. This indexing method can effectively and rapidly retrieve all similar proteins from a large database for a given protein query. A theoretical and conceptual framework is derived, based on published works that use indexing techniques for different applications.