On the duplication distance of binary strings

N. Alon, Jehoshua Bruck, Farzad Farnoud, Siddhartha Jain
Venue: 2016 IEEE International Symposium on Information Theory (ISIT)
Publication date: 2016-07-10
DOI: 10.1109/ISIT.2016.7541301 (https://doi.org/10.1109/ISIT.2016.7541301)
Source: Semantic Scholar
Citations: 2

Abstract

We study the tandem duplication distance between binary sequences and their roots. This distance is motivated by genomic tandem duplication mutations and counts the smallest number of tandem duplication events that are required to take one sequence to another. We consider both exact and approximate tandem duplications, the latter leading to a combined duplication/Hamming distance. The paper focuses on the maximum value of the duplication distance to the root. For exact duplication, denoting the maximum distance to the root of a sequence of length n by f(n), we prove that f(n) = Θ(n). For the case of approximate duplication, where a β-fraction of symbols may be duplicated incorrectly, we show using the Plotkin bound that the maximum distance has a sharp transition from linear to logarithmic in n at β = 1/2.
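
To make the exact-duplication definitions concrete, here is a minimal brute-force sketch in Python. It is not the paper's method. It assumes that a tandem duplication turns u v w into u v v w, that a root is a string with no adjacent repeated block, and that the distance to the root is the fewest duplications needed to build the string from it. The names `dedup_neighbors`, `root_and_distance`, and `f` are hypothetical, chosen only for this illustration. The search runs backwards, undoing one duplication per step, and is exponential, so it is only practical for small n.

```python
from collections import deque
from itertools import product


def dedup_neighbors(s):
    """Strings obtained from s by undoing one tandem duplication (u v v w -> u v w)."""
    out = set()
    n = len(s)
    for i in range(n):
        for k in range(1, (n - i) // 2 + 1):
            # s[i:i+k] followed immediately by an identical block is a square v v
            if s[i:i + k] == s[i + k:i + 2 * k]:
                out.add(s[:i + k] + s[i + 2 * k:])
    return out


def root_and_distance(s):
    """Breadth-first search over deduplications.

    Returns (root, d): the first square-free string reached and the
    minimum number of steps d needed to reach it (assumed definition).
    """
    seen = {s}
    queue = deque([(s, 0)])
    while queue:
        cur, d = queue.popleft()
        nbrs = dedup_neighbors(cur)
        if not nbrs:  # no square left, so cur is square-free
            return cur, d
        for t in nbrs:
            if t not in seen:
                seen.add(t)
                queue.append((t, d + 1))


def f(n):
    """Maximum distance to the root over all binary strings of length n."""
    return max(root_and_distance("".join(bits))[1]
               for bits in product("01", repeat=n))


if __name__ == "__main__":
    print(root_and_distance("0110"))    # ('010', 1)
    print(root_and_distance("010101"))  # ('01', 2)
    print([f(n) for n in range(1, 5)])  # [0, 1, 2, 2]
```

Because every deduplication shortens the string, the search always terminates. Tabulating `f(n)` for small n gives a feel for the quantity that the abstract shows to be Θ(n), although a handful of values cannot of course confirm the asymptotics.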