Discovering Motifs in DNA Sequences: A Suffix Tree Based Approach

2018 IEEE 8th International Advance Computing Conference (IACC) Pub Date : 2018-12-01 DOI:10.1109/IADCC.2018.8692107

S. Prakash, Harshit Agarwal, Urvi Agarwal, Prantik Biswas, Suma Dawn Jaypee

Citations: 5

Abstract

Motif discovery, also known as motif finding, is a challenging problem in bioinformatics. It uses computational and statistical techniques to identify short patterns, often referred to as motifs, that correspond to transcription factor binding sites in DNA sequences. Owing to the recent growth of bioinformatics, a good number of algorithms have been proposed. This paper proposes an algorithm that extracts transcription factor binding sites from a set of DNA sequences by iterating successively over the sequences provided. The motifs we work on are of unknown length, ungapped, and non-mutated. The algorithm uses a suffix trie to find such sites. In this approach, the first sequence is used as the base for constructing the suffix trie, and the other sequences are mapped against it, which results in extraction of the motif. Additionally, this algorithm can be applied to related problems in fields such as data mining and pattern detection.
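
The abstract does not give the algorithm's details, so the following Python sketch is only one plausible reading of the approach it describes: build a suffix trie from the first sequence, walk every other sequence through it, and report the longest exact, ungapped substrings that occur in all sequences. The names `TrieNode`, `build_suffix_trie`, `mark_sequence`, and `longest_shared_motifs` are hypothetical and do not come from the paper.

```python
class TrieNode:
    """One node of an uncompressed suffix trie (hypothetical helper)."""
    __slots__ = ("children", "seen_in")

    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.seen_in = set()  # ids of the other sequences that reach this node


def build_suffix_trie(base):
    """Insert every suffix of the base sequence into a trie."""
    root = TrieNode()
    for start in range(len(base)):
        node = root
        for ch in base[start:]:
            node = node.children.setdefault(ch, TrieNode())
    return root


def mark_sequence(root, seq, seq_id):
    """Walk each suffix of seq down the trie, tagging the nodes it reaches."""
    for start in range(len(seq)):
        node = root
        for ch in seq[start:]:
            node = node.children.get(ch)
            if node is None:  # this substring does not occur in the base
                break
            node.seen_in.add(seq_id)


def longest_shared_motifs(sequences):
    """Return the longest substrings present in every sequence (exact match)."""
    if not sequences:
        return []
    base, others = sequences[0], sequences[1:]
    root = build_suffix_trie(base)
    for seq_id, seq in enumerate(others):
        mark_sequence(root, seq, seq_id)

    needed = len(others)
    best, best_len = [], 0
    stack = [(root, "")]  # iterative depth-first search over shared nodes
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            if len(child.seen_in) != needed:
                continue  # substring is missing from at least one sequence
            word = prefix + ch
            if len(word) > best_len:
                best, best_len = [word], len(word)
            elif len(word) == best_len:
                best.append(word)
            stack.append((child, word))
    return sorted(best)


if __name__ == "__main__":
    print(longest_shared_motifs(["ACGTTGACA", "TTTGACAGG", "CCTTGACAT"]))
```

An uncompressed suffix trie needs memory quadratic in the length of the base sequence, so this sketch suits short sequences only; a compressed suffix tree would be the usual choice for longer inputs.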

