Tutorial on 8 Genotype Files Conversion

Muhammad Muneeb, Samuel F. Feng, Andreas Henschel
Published in: 2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB), 2022-05-13
DOI: https://doi.org/10.1109/icbcb55259.2022.9802470
Citations: 1

Abstract

This article documents file format conversion procedures for eight genotype file formats, using existing tools such as Plink, Samtools, and Gtools, plus custom scripts where necessary. It provides documentation and the corresponding code segment for each conversion, so that beginners get ready-made conversion procedures and researchers can build on the existing code to develop faster, enhanced ones. The code is written in Python and GNU commands, so it can be deployed on anything from general-purpose computers to high-performance computing setups. The documentation is written as a tutorial: it explains the reason for each step in a conversion procedure and the step's effect on the intermediate genotype data, which should help readers who struggle with file conversion when developing their analysis pipelines. The first version of the documentation covers eight file formats: VCF, BED-BIM-FAM, PED-MAP, GEN-SAMPLE, RAW, HAPS-LEGEND-SAMPLE, 23andMe, and AncestryDNA.
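As an illustration of the kind of conversion step the abstract describes, the following is a minimal sketch that assembles PLINK 1.9 command lines for a few of the listed formats. It is not the authors' code, which is not reproduced on this page. The helper name `plink_convert_command`, the file names, and the choice to cover only these formats are assumptions made for the example; the flags (`--vcf`, `--bfile`, `--file`, `--make-bed`, `--recode`, `--out`) are standard PLINK 1.9 options.

```python
import shlex

# PLINK 1.9 input flags, keyed by source format.
# VCF takes a full file name; the other two take a file-set prefix.
INPUT_FLAGS = {
    "vcf": "--vcf",     # e.g. sample.vcf
    "bed": "--bfile",   # prefix of .bed/.bim/.fam
    "ped": "--file",    # prefix of .ped/.map
}

# PLINK 1.9 output flags, keyed by target format.
OUTPUT_FLAGS = {
    "bed": ["--make-bed"],          # writes .bed/.bim/.fam
    "ped": ["--recode"],            # writes .ped/.map
    "raw": ["--recode", "A"],       # writes additive-coded .raw
    "vcf": ["--recode", "vcf"],     # writes .vcf
    "gen": ["--recode", "oxford"],  # writes .gen/.sample
}


def plink_convert_command(src, dst, in_path, out_prefix, plink="plink"):
    """Build the argument list for one conversion (hypothetical helper).

    Nothing is executed here; the caller decides how to run the command.
    """
    if src not in INPUT_FLAGS:
        raise ValueError(f"unsupported source format: {src}")
    if dst not in OUTPUT_FLAGS:
        raise ValueError(f"unsupported target format: {dst}")
    return [plink, INPUT_FLAGS[src], in_path, *OUTPUT_FLAGS[dst], "--out", out_prefix]


if __name__ == "__main__":
    # Example: VCF to BED-BIM-FAM; file names are placeholders.
    cmd = plink_convert_command("vcf", "bed", "sample.vcf", "sample")
    print(shlex.join(cmd))
```

To actually run a conversion, the returned list can be passed to `subprocess.run(cmd, check=True)`, assuming a `plink` binary is on the `PATH`. Formats that PLINK 1.9 does not read or write directly would need other tools or custom scripts and are left out of this sketch.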