Fast and Accurate Genomic Minisatellites Disclosure

Reza Behboodi, Mahmoud Naghibzadeh, Mostafa Nouri-Baygi
Published in: 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), 2020-10-29
DOI: 10.1109/ICCKE50421.2020.9303664 (https://doi.org/10.1109/ICCKE50421.2020.9303664)
Citations: 1

Abstract

Minisatellites are genomic sequences composed of short monomers that are repeated successively many times in the same direction. They are highly variable in their monomer building blocks, their number of repeats, and their location in the genomic sequence. This mutability has made them excellent genetic markers for both linkage analysis and forensic science. This paper presents an accurate and highly efficient computational method for identifying all minisatellites in a given DNA, gene, or genomic sequence. It is based on a new indexing method that uses neither main memory nor secondary storage to store search keys. Furthermore, for each search-key value, which is itself not stored, a pointer points to a list of occurrences of that value in the sequence. Potential minisatellites are detected from these lists, and the actual ones are then recognized among them. The software can discover all minisatellites in very large sequences, such as the human genome with 3.2 giga base pairs, in a very short time. The minimum number of repeats of the motif is set to 3. An advantage of the software is that it detects overlapping minisatellites that some state-of-the-art software cannot detect. In terms of running time, searching for minisatellites with the proposed approach is faster than with TRF, Mreps, and Dot2dot.
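The abstract does not spell out the indexing scheme, so the following is only a minimal sketch of one way the idea could work, not the authors' implementation. It assumes a direct-address table: each k-mer over A, C, G, T is encoded as a base-4 integer, and that integer is used as the table slot, so the search key itself is never stored. Each slot holds the list of positions where the k-mer occurs, and runs of occurrences spaced exactly k apart are reported as candidate tandem repeats. The names `build_index`, `decode`, and `find_tandem_runs` are hypothetical, and the fixed motif length `k` is a simplification.

```python
from typing import Dict, List, Tuple

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def build_index(seq: str, k: int) -> List[List[int]]:
    """Direct-address table: slot i lists the start positions of the
    k-mer whose base-4 code is i. The k-mer string is never stored."""
    table: List[List[int]] = [[] for _ in range(4 ** k)]
    mask = 4 ** k - 1
    key = 0
    valid = 0  # consecutive A/C/G/T bases ending at the current position
    for i, base in enumerate(seq):
        code = CODE.get(base)
        if code is None:  # e.g. 'N': restart the rolling key
            key, valid = 0, 0
            continue
        key = ((key << 2) | code) & mask
        valid += 1
        if valid >= k:
            table[key].append(i - k + 1)
    return table


def decode(key: int, k: int) -> str:
    """Recover the k-mer string from its slot number."""
    return "".join("ACGT"[(key >> (2 * (k - 1 - j))) & 3] for j in range(k))


def find_tandem_runs(seq: str, k: int, min_copies: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copies) for every maximal run of a k-mer
    repeated back to back at least min_copies times."""
    hits: List[Tuple[int, str, int]] = []
    for key, positions in enumerate(build_index(seq, k)):
        if len(positions) < min_copies:
            continue
        copies_ending_at: Dict[int, int] = {}
        for pos in positions:  # positions are in ascending order
            copies_ending_at[pos] = copies_ending_at.get(pos - k, 0) + 1
        for pos, copies in copies_ending_at.items():
            is_maximal = (pos + k) not in copies_ending_at
            if is_maximal and copies >= min_copies:
                hits.append((pos - (copies - 1) * k, decode(key, k), copies))
    return sorted(hits)


if __name__ == "__main__":
    for start, motif, copies in find_tandem_runs("TTACGACGACGACGTT", 3):
        print(start, motif, copies)
```

In this sketch, rotations of the same motif (ACG, CGA, GAC) are reported as separate, overlapping candidates; a real tool would need a further step to merge or rank them and to tolerate mismatches. The table has 4^k slots, so this toy version is only practical for small k, whereas minisatellite motifs are usually longer.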