Prediction of transposable elements evolution using tabu search

Lingling Jin, Ian McQuillan
Published in: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018-12-01
DOI: 10.1109/BIBM.2018.8621478 (https://doi.org/10.1109/BIBM.2018.8621478)
Citations: 0

Abstract
Transposable elements (TEs) are DNA sequences that can move or copy themselves to new positions within a genome. Because TEs are abundant in many species, predicting how they evolve within a genome is a major part of understanding the evolution of the genome as a whole. The sequential interruption model is defined between TEs that occur in a single genome, and previous literature has shown it to be useful for predicting TE ages and periods of activity throughout evolution. This model is closely related to a classic matrix optimization problem, the linear ordering problem (LOP). By applying tabu search, a well-studied method for solving the LOP, to the sequential interruption model, a relative age order of all TEs in the human genome is predicted in only 38 seconds. A comparison of the TE orderings produced by tabu search and by the previously existing method shows that tabu search solves the TE problem far more efficiently while also achieving a more accurate result. The speed improvement allows a complete prediction for human TEs to be made, whereas previously only a small portion of human TEs could be ordered. A simulation of TE transpositions throughout evolution is then developed and used as a form of in silico verification of the sequential interruption model. The simulated TE remnants and activity data are fed into the model to predict a relative age order, and the correlation between this predicted order and the input (true) age order of the simulation is then quantified. Averaged over ten simulations, the correlation with the true simulated order is 0.738.
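To make the connection between the linear ordering problem and tabu search concrete, the following is a minimal Python sketch of a generic tabu search over permutations. It is not the authors' implementation. The function names (`lop_score`, `tabu_search_lop`), the insertion neighbourhood, the element-based tabu list, the aspiration rule, and the default parameters are all assumptions chosen for illustration. The input is a square matrix in which entry `[a][b]` is treated as a hypothetical weight of evidence that item `a` should precede item `b`.

```python
import random
from typing import List, Sequence, Tuple


def lop_score(matrix: Sequence[Sequence[float]], order: Sequence[int]) -> float:
    """LOP objective: sum of matrix[a][b] over all pairs where a precedes b."""
    return sum(
        matrix[order[i]][order[j]]
        for i in range(len(order))
        for j in range(i + 1, len(order))
    )


def tabu_search_lop(
    matrix: Sequence[Sequence[float]],
    iterations: int = 200,
    tenure: int = 3,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Illustrative tabu search for the LOP (assumed design, not the paper's).

    Neighbourhood: remove one item and reinsert it at another position.
    Tabu rule: an item that was just moved may not move again for `tenure`
    iterations, unless the move beats the best score so far (aspiration).
    """
    n = len(matrix)
    rng = random.Random(seed)
    current = list(range(n))
    rng.shuffle(current)
    best, best_score = current[:], lop_score(matrix, current)
    tabu_until = [0] * n  # item e is tabu while iteration < tabu_until[e]

    for it in range(iterations):
        candidate = None  # (score, order, moved_item) of the best allowed move
        for i in range(n):
            item = current[i]
            rest = current[:i] + current[i + 1:]
            for j in range(n):
                if j == i:
                    continue  # reinserting at the same position changes nothing
                neighbour = rest[:j] + [item] + rest[j:]
                # Full re-evaluation keeps the sketch short; real solvers
                # compute the change in score incrementally.
                score = lop_score(matrix, neighbour)
                if it < tabu_until[item] and score <= best_score:
                    continue  # tabu and not good enough for aspiration
                if candidate is None or score > candidate[0]:
                    candidate = (score, neighbour, item)
        if candidate is None:
            break  # every move is tabu
        # Accept the best allowed move even if it is worse than the current
        # order; this is what lets tabu search leave local optima.
        score, current, moved = candidate
        tabu_until[moved] = it + 1 + tenure
        if score > best_score:
            best, best_score = current[:], score
    return best, best_score
```

Calling `tabu_search_lop(m)` on a square matrix `m` returns the best ordering found and its objective value. Applied to the TE setting, the returned permutation would be read as a relative age order, under the assumption that the matrix has been built from interruption data so that larger entries favour placing one TE before another.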