GWarrange: a pre- and post- genome-wide association studies pipeline for detecting phenotype-associated genome rearrangement events.

Impact factor 4.0 · Zone 2, Biology · JCR Q1, Genetics & Heredity
Yi Ling Tam, Sarah Cameron, Andrew Preston, Lauren Cowley
Microbial Genomics. Published 2024-07-01. DOI: 10.1099/mgen.0.001268 (https://doi.org/10.1099/mgen.0.001268). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11316554/pdf/

Abstract

The use of k-mers to capture genetic variation in bacterial genome-wide association studies (bGWAS) has demonstrated its effectiveness in overcoming the plasticity of bacterial genomes by providing a comprehensive array of genetic variants in a genome set that is not confined to a single reference genome. However, little attempt has been made to interpret k-mers in the context of genome rearrangements, partly due to challenges in the exhaustive and high-throughput identification of genome structure and individual rearrangement events. Here, we present GWarrange, a pre- and post-bGWAS processing methodology that leverages the unique properties of k-mers to facilitate bGWAS for genome rearrangements. Repeat sequences are common instigators of genome rearrangements through intragenomic homologous recombination, and they are commonly found at rearrangement boundaries. Using whole-genome sequences, repeat sequences are replaced by short placeholder sequences, allowing the regions flanking repeats to be incorporated into relatively short k-mers. Then, locations of flanking regions in significant k-mers are mapped back to complete genome sequences to visualise genome rearrangements. Four case studies based on two bacterial species (Bordetella pertussis and Enterococcus faecium) and a simulated genome set are presented to demonstrate the ability to identify phenotype-associated rearrangements. GWarrange is available at https://github.com/DorothyTamYiLing/GWarrange.
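The repeat-replacement idea in the abstract can be illustrated with a toy sketch. This is not the GWarrange implementation. The sequences, the five-base `N` placeholder, the value of `k`, and the helper names `revcomp`, `mask_repeats` and `spanning_kmers` are all assumptions chosen for illustration. The sketch masks a repeat (in either orientation) in two toy genomes that differ by an inversion between the two repeat copies, then collects the short k-mers that span a placeholder together with sequence on both sides.

```python
# Toy illustration only: names, sequences and parameters are hypothetical.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def mask_repeats(genome: str, repeat: str, placeholder: str) -> str:
    """Replace the repeat, in forward or reverse orientation, with a placeholder."""
    masked = genome.replace(repeat, placeholder)
    return masked.replace(revcomp(repeat), placeholder)


def spanning_kmers(masked: str, k: int, placeholder: str) -> set[str]:
    """Collect k-mers holding one whole placeholder with flanking bases on both sides."""
    found = set()
    for start in range(len(masked) - k + 1):
        kmer = masked[start:start + k]
        pos = kmer.find(placeholder)
        has_left_flank = pos > 0
        has_right_flank = pos + len(placeholder) < k
        only_one_placeholder = kmer.count("N") == len(placeholder)
        if pos != -1 and has_left_flank and has_right_flank and only_one_placeholder:
            found.add(kmer)
    return found


# Three unique segments separated by an inverted pair of repeat copies.
SEG_A, SEG_B, SEG_C = "ATGGCCTA", "GGATTCAC", "TTCAGGCA"
REPEAT = "CCCCGGGGAAAA"
PLACEHOLDER = "N" * 5
K = 11

# Genome 2 carries an inversion of the segment between the two repeat copies.
genome_1 = SEG_A + REPEAT + SEG_B + revcomp(REPEAT) + SEG_C
genome_2 = SEG_A + REPEAT + revcomp(SEG_B) + revcomp(REPEAT) + SEG_C

masked_1 = mask_repeats(genome_1, REPEAT, PLACEHOLDER)
masked_2 = mask_repeats(genome_2, REPEAT, PLACEHOLDER)

kmers_1 = spanning_kmers(masked_1, K, PLACEHOLDER)
kmers_2 = spanning_kmers(masked_2, K, PLACEHOLDER)

# K-mers present in only one genome mark the rearranged repeat boundaries.
print(sorted(kmers_1 - kmers_2))
print(sorted(kmers_2 - kmers_1))
```

Without masking, a k-mer would have to be longer than the 12-base repeat to contain sequence from both flanks. With the placeholder, an 11-mer is enough, and the k-mers unique to each genome are those that would be tested for association with the phenotype. A real k-mer counter would typically also collapse each k-mer with its reverse complement, which this sketch omits.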

Source journal: Microbial Genomics (Medicine-Epidemiology)
CiteScore: 6.60
Self-citation rate: 2.60%
Articles published: 153
Review time: 12 weeks
About the journal: Microbial Genomics (MGen) is a fully open access, mandatory open data and peer-reviewed journal publishing high-profile original research on archaea, bacteria, microbial eukaryotes and viruses.