Discovering Motifs in DNA Sequences: A Suffix Tree Based Approach

S. Prakash, Harshit Agarwal, Urvi Agarwal, Prantik Biswas, Suma Dawn Jaypee
Published in: 2018 IEEE 8th International Advance Computing Conference (IACC)
Publication date: 2018-12-01
DOI: 10.1109/IADCC.2018.8692107 (https://doi.org/10.1109/IADCC.2018.8692107)
Citations: 5

Abstract

Motif discovery, also known as motif finding, is a challenging problem in bioinformatics. It applies computational and statistical techniques to identify short patterns, called motifs, that correspond to transcription factor binding sites in DNA sequences. With the recent growth of bioinformatics, a good number of algorithms have been proposed for this task. This paper proposes an algorithm that extracts transcription factor binding sites from a set of DNA sequences by iterating successively over the sequences provided. The motifs considered are of unknown length, ungapped, and non-mutated. The algorithm uses a suffix trie to find such sites: the first sequence serves as the base for constructing the suffix trie, and the other sequences are then mapped against it, which results in extraction of the motif. The algorithm can also be applied to related problems in data mining, pattern detection, and similar fields.
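The abstract does not spell out the procedure step by step, so the following is only a minimal sketch of the general idea, under stated assumptions: a naive suffix trie (quadratic in the length of the base sequence) is built from the first sequence, every other sequence is walked through it, and the longest exact, ungapped substring reached by all sequences is reported as the motif. The names `TrieNode`, `build_suffix_trie`, `mark_sequence`, and `longest_shared_motifs` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


class TrieNode:
    """One node of the suffix trie; the path from the root spells a substring."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Indices of the non-base sequences that contain this node's substring.
        self.seen_in: Set[int] = set()


def build_suffix_trie(base: str) -> TrieNode:
    """Insert every suffix of the base sequence (naive O(n^2) construction)."""
    root = TrieNode()
    for start in range(len(base)):
        node = root
        for ch in base[start:]:
            node = node.children.setdefault(ch, TrieNode())
    return root


def mark_sequence(root: TrieNode, seq: str, seq_id: int) -> None:
    """Walk each suffix of seq through the trie, marking the nodes it reaches."""
    for start in range(len(seq)):
        node = root
        for ch in seq[start:]:
            nxt = node.children.get(ch)
            if nxt is None:
                break  # this substring does not occur in the base sequence
            nxt.seen_in.add(seq_id)
            node = nxt


def longest_shared_motifs(sequences: List[str]) -> List[str]:
    """Return the longest exact substrings present in every sequence (sorted)."""
    if not sequences:
        return []
    base, others = sequences[0], sequences[1:]
    root = build_suffix_trie(base)
    for seq_id, seq in enumerate(others):
        mark_sequence(root, seq, seq_id)

    needed = len(others)
    best: List[str] = []
    best_len = 0
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            # Only descend while the substring is shared by all sequences.
            if len(child.seen_in) == needed:
                word = prefix + ch
                if len(word) > best_len:
                    best, best_len = [word], len(word)
                elif len(word) == best_len:
                    best.append(word)
                stack.append((child, word))
    return sorted(best)


if __name__ == "__main__":
    print(longest_shared_motifs(["ACGTTGACCA", "TTTGACGG", "CCTGACTT"]))
```

Because the motif length is unknown in advance, the sketch reports all shared substrings of maximal length rather than a fixed-width window. A compressed suffix tree would reduce memory use for long sequences; the plain trie is used here only because it is what the abstract names.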