Discovering Motifs in DNA Sequences: A Suffix Tree Based Approach

2018 IEEE 8th International Advance Computing Conference (IACC). Pub Date: 2018-12-01. DOI: 10.1109/IADCC.2018.8692107

S. Prakash, Harshit Agarwal, Urvi Agarwal, Prantik Biswas, Suma Dawn Jaypee


Citations: 5

Abstract

Motif discovery, also known as motif finding, is a challenging problem in bioinformatics. It applies computational and statistical techniques to identify short patterns, called motifs, that correspond to transcription factor binding sites in DNA sequences. With the recent growth of bioinformatics, a good number of algorithms have been proposed. This paper proposes an algorithm that extracts transcription factor binding sites from a set of DNA sequences by iterating successively over the sequences provided. The motifs considered are of unknown length, ungapped, and non-mutated. The algorithm uses a suffix trie to find such sites: the first sequence serves as the base for constructing the suffix trie, and the other sequences are mapped against it, which yields the motif. The algorithm can also be applied to related problems in data mining, pattern detection, and similar fields.
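The abstract does not say how the remaining sequences are mapped onto the trie, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the motif is the longest substring that occurs exactly (ungapped, non-mutated) in every input sequence. The names `TrieNode`, `build_suffix_trie`, `mark_sequence`, and `longest_common_motifs` are hypothetical and chosen for illustration.

```python
class TrieNode:
    """One node of a suffix trie; the path from the root spells a substring."""
    __slots__ = ("children", "seen_in", "last_seq")

    def __init__(self):
        self.children = {}   # base character -> child TrieNode
        self.seen_in = 0     # how many of the other sequences contain this substring
        self.last_seq = -1   # id of the last sequence that reached this node


def build_suffix_trie(seq):
    """Insert every suffix of the base sequence (quadratic space; fine for short inputs)."""
    root = TrieNode()
    for start in range(len(seq)):
        node = root
        for ch in seq[start:]:
            node = node.children.setdefault(ch, TrieNode())
    return root


def mark_sequence(root, seq, seq_id):
    """Walk each suffix of seq through the trie, counting every node at most once per sequence."""
    for start in range(len(seq)):
        node = root
        for ch in seq[start:]:
            node = node.children.get(ch)
            if node is None:
                break  # this substring does not occur in the base sequence
            if node.last_seq != seq_id:
                node.last_seq = seq_id
                node.seen_in += 1


def longest_common_motifs(sequences):
    """Return all longest substrings shared exactly by every sequence, sorted."""
    base, others = sequences[0], sequences[1:]
    root = build_suffix_trie(base)
    for seq_id, seq in enumerate(others):
        mark_sequence(root, seq, seq_id)

    best, best_len = [], 0
    stack = [(root, "")]
    while stack:  # iterative DFS, restricted to nodes reached by every other sequence
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            if child.seen_in == len(others):
                word = prefix + ch
                if len(word) > best_len:
                    best, best_len = [word], len(word)
                elif len(word) == best_len:
                    best.append(word)
                stack.append((child, word))
    return sorted(best)


if __name__ == "__main__":
    print(longest_common_motifs(["ACGTTGACCA", "TTTGACGG", "CATGACTT"]))  # ['TGAC']
```

Pruning the search at nodes that some sequence never reached is safe because every prefix of a shared substring is itself shared. A suffix trie built this way can hold a number of nodes quadratic in the length of the base sequence, so a compressed suffix tree would be the usual choice for long inputs.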

