Tutorial on 8 Genotype Files Conversion
Muhammad Muneeb, Samuel F. Feng, Andreas Henschel
2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB), published 2022-05-13
DOI: https://doi.org/10.1109/icbcb55259.2022.9802470
This article documents file format conversion procedures for eight genotype file formats, using existing tools such as PLINK, Samtools, and Gtools, with custom scripts where necessary. Each conversion comes with documentation and the corresponding code segment, so that beginners have ready-made conversion procedures and researchers can build on the existing code to develop faster, enhanced ones. The code is written in Python and GNU commands, so it can be deployed on anything from general-purpose computers to high-performance computing setups. The documentation is written as a tutorial: it explains the reason for each step in a conversion procedure and that step's effect on the intermediate genotype data, which should help people who struggle with file conversion when building their analysis pipelines. The first version of the documentation covers eight file formats: VCF, BED-BIM-FAM, PED-MAP, GEN-SAMPLE, RAW, HAPS-LEGEND-SAMPLE, 23andMe, and AncestryDNA.
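To give a feel for what one such conversion involves, here is a minimal Python sketch that turns 23andMe-style raw data into PED-MAP text. It is not the authors' code. It assumes the common 23andMe layout (comment lines starting with `#`, then tab-separated rsid, chromosome, position, genotype) and the usual PLINK text conventions (`0` for a missing allele, `-9` for a missing phenotype). The function names, the sample ID `SAMPLE1`, and the rsids in the example are all invented for illustration.

```python
import io


def parse_23andme(lines):
    """Yield (rsid, chrom, pos, genotype) from 23andMe-style raw-data lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header comments.
        if not line or line.startswith("#"):
            continue
        rsid, chrom, pos, genotype = line.split("\t")
        yield rsid, chrom, int(pos), genotype


def genotype_to_alleles(genotype):
    """Map a genotype string to two PED alleles ('0' means missing)."""
    if len(genotype) == 2 and all(base in "ACGT" for base in genotype):
        return genotype[0], genotype[1]
    if len(genotype) == 1 and genotype in "ACGT":
        # Simplification: a single-base (haploid) call is written twice.
        return genotype, genotype
    # No-calls ('--') and indel codes are treated as missing in this sketch.
    return "0", "0"


def to_ped_map(records, sample_id="SAMPLE1"):
    """Return (map_text, ped_text) for a single individual."""
    map_rows = []
    # FID, IID, father, mother, sex (0 = unknown), phenotype (-9 = missing).
    ped_fields = [sample_id, sample_id, "0", "0", "0", "-9"]
    for rsid, chrom, pos, genotype in records:
        # MAP columns: chromosome, variant ID, genetic distance, bp position.
        map_rows.append(f"{chrom}\t{rsid}\t0\t{pos}")
        ped_fields.extend(genotype_to_alleles(genotype))
    return "\n".join(map_rows) + "\n", " ".join(ped_fields) + "\n"


# Invented example data, not taken from any real genome file.
EXAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "rs0000002\t1\t2000\t--\n"
    "rs0000003\tX\t3000\tC\n"
)

map_text, ped_text = to_ped_map(parse_23andme(io.StringIO(EXAMPLE)))
print(map_text, end="")
print(ped_text, end="")
```

In practice an existing tool is usually the better choice (PLINK 1.9, for instance, offers a `--23file` option); the sketch only shows which fields move where and how missing calls might be encoded.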