A 28nm Fully Integrated End-to-End Genome Analysis Accelerator for Next-Generation Sequencing
Authors: Yi-Chung Wu, Yen-Lung Chen, Chung-Hsuan Yang, Chao-Hsi Lee, Wen-Ching Chen, Liang-Yi Lin, Nian-Shyang Chang, Chun-Pin Lin, Chi-Shi Chen, Jui-Hung Hung, Chia-Hsiang Yang
Journal: IEEE Transactions on Biomedical Circuits and Systems. Published: 2025-03-27. DOI: 10.1109/TBCAS.2025.3555579 (https://doi.org/10.1109/TBCAS.2025.3555579)
This paper presents the first end-to-end next-generation sequencing (NGS) data analysis accelerator covering short-read mapping, haplotype calling, variant calling, and genotyping. It supports both single-end and paired-end short reads (hereafter, reads). For the exact-match stage of short-read mapping it uses the FM-index, a compact index data structure; for the inexact-match stage, a dynamic programming array is proposed to determine the mapping results. A rapid similarity calculation reduces the short-read mapping workload, and a rescue technique is adopted to increase overall sensitivity. In haplotype calling, a parallel k-mer processing engine constructs the de Bruijn graph and assembles the haplotypes. In variant calling, a variant discovery engine determines the variants between a subject and a reference genome sequence. Lastly, a genotype likelihood computing engine computes genotype likelihoods in parallel and outputs the genotypes of all discovered variants with their corresponding Phred-scaled likelihood (PL) values. This work completes end-to-end data analysis for the 50× PrecisionFDA dataset in an average of 28.2 minutes. It achieves 3-to-59× higher throughput than existing solutions, with higher precision (99.79%) and sensitivity (99.03%). The chip also achieves 935× higher energy efficiency than the Illumina DRAGEN FPGA acceleration system.
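
For readers unfamiliar with the FM-index mentioned above, the following is a minimal software sketch of exact matching by backward search. It is an illustration only, not the chip's hardware design: the function names `build_fm_index` and `exact_match`, the naive suffix-array construction, and the toy reference string are all assumptions introduced here.

```python
def build_fm_index(text):
    """Build a toy FM-index for `text`: suffix array, C table, Occ table.

    Hypothetical helper for illustration; a real index would use a
    compressed representation and a faster construction.
    """
    text += "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array: acceptable for a short example string.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[ch] = number of characters in text strictly smaller than ch.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def exact_match(pattern, sa, c_table, occ):
    """Return sorted start positions of `pattern` using backward search."""
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        # Narrow the range to suffixes that start with ch + (matched part).
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []  # no occurrence
    return sorted(sa[lo:hi])


# Toy reference and reads (made up for this example).
sa, c_table, occ = build_fm_index("ACGTACGTTACG")
print(exact_match("ACG", sa, c_table, occ))  # [0, 4, 9]
print(exact_match("GGG", sa, c_table, occ))  # []
```

Each pattern character costs two table lookups regardless of reference length, which is why the structure suits the exact-match stage; reads that fail this stage would be passed to an inexact matcher such as a dynamic programming aligner.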