FSTest: an efficient tool for cross-population fixation index estimation on variant call format files.

IF 2.9 · Zone 4 (Biology) · JCR Q1 (EDUCATION & EDUCATIONAL RESEARCH)
Journal of Genetics Pub Date : 2024-01-01
Seyed Milad Vahedi, Siavash Salek Ardestani

Abstract

Fixation index (Fst) statistics provide critical insights into the evolutionary processes that shape the structure of genetic variation within and among populations. Fst statistics have been widely applied in population and evolutionary genetics to identify genomic regions targeted by selection pressures. The FSTest 1.3 software was developed to estimate four Fst statistics (those of Hudson, Weir and Cockerham, Nei, and Wright) from high-throughput genotyping or sequencing data. Here, we introduce FSTest 1.3 and compare its performance with two widely used software packages, VCFtools 0.1.16 and PLINK 2.0. Chromosome 1 variant data from 1000 Genomes Phase III for South Asian (n = 211) and African (n = 274) populations were used as an example case. The different Fst estimates were calculated for each single-nucleotide polymorphism (SNP) in a pairwise comparison of the South Asian and African populations, and the results of FSTest 1.3 were confirmed by VCFtools 0.1.16 and PLINK 2.0. Two sliding window approaches, one based on a fixed number of SNPs and the other on a fixed number of base pairs (bp), were applied using FSTest 1.3 and VCFtools 0.1.16. Our results showed that regions with low-coverage genotypic data could lead to an overestimation of Fst in sliding window analysis using a fixed number of bp. FSTest 1.3 could mitigate this problem by averaging Fst over consecutive SNPs along the chromosome. FSTest 1.3 allows direct analysis of VCF files with a small amount of code and can calculate Fst estimates for more than a million SNPs in a few minutes on a desktop computer. FSTest 1.3 is freely available at https://github.com/similab/FSTest.
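The per-SNP estimate and the two windowing schemes mentioned in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not FSTest's code or interface: all function names are hypothetical, the allele frequencies are invented toy values, and the estimator is the commonly used form of Hudson's Fst written in terms of sample allele frequencies (as given in Bhatia et al. 2013), with n counted as sampled allele copies. Combining SNPs within a window as a ratio of sums is also an assumption and may differ from how FSTest averages.

```python
from typing import List, Tuple

Component = Tuple[float, float]  # (numerator, denominator) of Hudson's Fst for one SNP
Window = Tuple[int, int, int, float]  # (start_bp, end_bp, n_snps, fst)


def hudson_components(p1: float, n1: int, p2: float, n2: int) -> Component:
    """Numerator and denominator of Hudson's Fst for one biallelic SNP.

    p1, p2: sample frequencies of the same allele in populations 1 and 2.
    n1, n2: number of sampled allele copies (2 x individuals for diploids).
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def ratio_of_sums(components: List[Component]) -> float:
    """Combine SNPs as sum(numerators) / sum(denominators); NaN if undefined."""
    num = sum(c[0] for c in components)
    den = sum(c[1] for c in components)
    return num / den if den > 0 else float("nan")


def windows_by_snp_count(positions: List[int], components: List[Component],
                         size: int, step: int) -> List[Window]:
    """Sliding windows that each contain exactly `size` consecutive SNPs."""
    out = []
    for start in range(0, len(positions) - size + 1, step):
        chunk = components[start:start + size]
        out.append((positions[start], positions[start + size - 1],
                    size, ratio_of_sums(chunk)))
    return out


def windows_by_bp(positions: List[int], components: List[Component],
                  size_bp: int, step_bp: int) -> List[Window]:
    """Sliding windows of fixed physical length; empty windows are skipped.

    Windows start at the first SNP position, which is an arbitrary choice.
    """
    out = []
    if not positions:
        return out
    start, last = positions[0], positions[-1]
    while start <= last:
        end = start + size_bp - 1
        chunk = [c for pos, c in zip(positions, components) if start <= pos <= end]
        if chunk:
            out.append((start, end, len(chunk), ratio_of_sums(chunk)))
        start += step_bp
    return out


if __name__ == "__main__":
    # Sample sizes from the abstract (individuals), converted to allele copies.
    n_sas, n_afr = 2 * 211, 2 * 274
    # Toy positions and allele frequencies, invented for illustration only.
    positions = [100, 200, 300, 5000, 9000]
    freqs = [(0.10, 0.40), (0.55, 0.50), (0.20, 0.80), (0.05, 0.60), (0.30, 0.35)]
    comps = [hudson_components(p1, n_sas, p2, n_afr) for p1, p2 in freqs]

    print("Fixed number of SNPs per window:")
    for w in windows_by_snp_count(positions, comps, size=3, step=1):
        print(w)
    print("Fixed number of bp per window:")
    for w in windows_by_bp(positions, comps, size_bp=1000, step_bp=1000):
        print(w)
```

In the toy data, the bp-based windows over the sparsely covered stretch each hold a single SNP, so their values rest on one site, whereas every SNP-count window is computed from the same number of sites.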

Source journal: Journal of Genetics (Biology: Genetics)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 72
Review time: 1 month
Journal description: The journal retains its traditional interest in evolutionary research that is of relevance to geneticists, even if this is not explicitly genetical in nature. The journal covers all areas of genetics and evolution, including molecular genetics and molecular evolution. It publishes papers and review articles on current topics, commentaries and essays on ideas and trends in genetics and evolutionary biology, historical developments, debates and book reviews. From 2010 onwards, the journal has published a special category of papers termed ‘Online Resources’. These are brief reports on the development and the routine use of molecular markers for assessing genetic variability within and among species. Also published are reports outlining pedagogical approaches in genetics teaching.