Towards the Development of Tandem Repeat Analyzer for Genome Sequence Data

Eesha Ingle, Abhiram Bhise

2010 Second International Conference on Computer Engineering and Applications, published 2010-03-19
DOI: 10.1109/ICCEA.2010.285 (https://doi.org/10.1109/ICCEA.2010.285)

Abstract

A tandem repeat in DNA is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats have been shown to cause human diseases, may play a variety of regulatory and evolutionary roles, and are important laboratory and analytical tools. Extensive knowledge about tandem repeats, such as their pattern size and mutational history, has been limited by the difficulty of detecting them in genome sequence data. In this paper, we present an algorithm for finding tandem repeats that requires neither the pattern nor the pattern size to be specified in advance. We model tandem repeats stochastically, by the percent identity and the frequency of indels (insertions and deletions) between adjacent pattern copies, and base detection on statistical recognition criteria from this model rather than on a minimal alignment score. Finally, the program aligns the repeat copies against a consensus sequence, revealing patterns of common mutations. These patterns yield insight into the history of the duplications that produced the tandem repeats, making the program a potentially valuable research tool.
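
To make the ideas of percent identity between adjacent copies and alignment against a consensus concrete, here is a minimal Python sketch. It is not the algorithm presented in the paper: it is a naive, substitution-only illustration that ignores indels and uses no statistical criteria. The function names (`adjacent_identity`, `best_period`, `consensus`, `mismatches`) and the example sequence are hypothetical, introduced only for this illustration. Note that a multiple of the true period can score comparably under this simple measure.

```python
from collections import Counter


def adjacent_identity(seq, period):
    """Fraction of positions i with seq[i] == seq[i + period].

    Assumption: substitutions only, so adjacent copies line up exactly.
    """
    pairs = len(seq) - period
    if pairs <= 0:
        return 0.0
    matches = sum(seq[i] == seq[i + period] for i in range(pairs))
    return matches / pairs


def best_period(seq, max_period=None, min_copies=2):
    """Scan candidate periods; return (period, identity) with the highest identity.

    The strict '>' comparison means the smallest period wins ties.
    """
    if max_period is None:
        max_period = len(seq) // min_copies
    best = (0, 0.0)
    for period in range(1, max_period + 1):
        identity = adjacent_identity(seq, period)
        if identity > best[1]:
            best = (period, identity)
    return best


def consensus(seq, period):
    """Column-wise majority vote over successive copies of length `period`."""
    copies = [seq[i:i + period] for i in range(0, len(seq), period)]
    columns = []
    for j in range(period):
        column = [copy[j] for copy in copies if j < len(copy)]
        columns.append(Counter(column).most_common(1)[0][0])
    return "".join(columns)


def mismatches(seq, period):
    """List (position, observed base, consensus base) where seq differs from the consensus."""
    cons = consensus(seq, period)
    return [
        (i, base, cons[i % period])
        for i, base in enumerate(seq)
        if base != cons[i % period]
    ]


if __name__ == "__main__":
    # Hypothetical input: four copies of ACGT with one substitution in the first copy.
    example = "ACCTACGTACGTACGT"
    period, identity = best_period(example)
    print(period, round(identity, 3))
    print(consensus(example, period))
    print(mismatches(example, period))
```

Handling insertions and deletions, as the abstract describes, would require an alignment step between copies and the consensus in place of the fixed-offset comparison used here.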