基于FPGA的SCFGS对RNA二级结构预测的细粒度并行计算

Y. Dou, Fei Xia, Jingfei Jiang
{"title":"基于FPGA的SCFGS对RNA二级结构预测的细粒度并行计算","authors":"Y. Dou, Fei Xia, Jingfei Jiang","doi":"10.1145/1629395.1629412","DOIUrl":null,"url":null,"abstract":"In the field of RNA secondary structure prediction, the CYK (Coche-Younger-Kasami) algorithm is a most popular methods using SCFG (stochastic context-free grammars) model. However, general purpose parallel computers including SMP multiprocessors or cluster systems exhibit low parallel efficiency and they are too expensive to be used easily for many research institutes. FPGA chips provide a new approach to accelerate the CYK algorithm by exploiting fine-grained custom design. The CYK algorithm shows complicated data dependence, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master PE and multiple slave PEs for fine grain hardware implementation on FPGA. We partition tasks by columns and assign tasks to PEs for load balance. We exploit data reuse schemes to reduce the need to load matrix from external memory. To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete CYK/inside algorithm. The experimental results show a factor of more than 14 speedup over the Infernal-0.55 software running on a PC platform with Pentium 4 2.66GHz CPU. The computational power of our platform with FPGA accelerator is comparable to a PC cluster consisting of 20 Intel-Xeon CPUs for RNA secondary structure prediction using SCFGs, but the hardware cost and power consumption is only about 15% and 10% of the latter respectively.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Fine-grained parallel application specific computing for RNA secondary structure prediction using SCFGS on FPGA\",\"authors\":\"Y. Dou, Fei Xia, Jingfei Jiang\",\"doi\":\"10.1145/1629395.1629412\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the field of RNA secondary structure prediction, the CYK (Coche-Younger-Kasami) algorithm is a most popular methods using SCFG (stochastic context-free grammars) model. However, general purpose parallel computers including SMP multiprocessors or cluster systems exhibit low parallel efficiency and they are too expensive to be used easily for many research institutes. FPGA chips provide a new approach to accelerate the CYK algorithm by exploiting fine-grained custom design. The CYK algorithm shows complicated data dependence, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master PE and multiple slave PEs for fine grain hardware implementation on FPGA. We partition tasks by columns and assign tasks to PEs for load balance. We exploit data reuse schemes to reduce the need to load matrix from external memory. To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete CYK/inside algorithm. The experimental results show a factor of more than 14 speedup over the Infernal-0.55 software running on a PC platform with Pentium 4 2.66GHz CPU. The computational power of our platform with FPGA accelerator is comparable to a PC cluster consisting of 20 Intel-Xeon CPUs for RNA secondary structure prediction using SCFGs, but the hardware cost and power consumption is only about 15% and 10% of the latter respectively.\",\"PeriodicalId\":136293,\"journal\":{\"name\":\"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems\",\"volume\":\"67 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-10-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1629395.1629412\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1629395.1629412","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

摘要

在RNA二级结构预测领域,CYK (Coche-Younger-Kasami)算法是一种使用SCFG(随机上下文自由语法)模型的最流行的方法。然而,包括SMP多处理器或集群系统在内的通用并行计算机表现出较低的并行效率,而且它们太昂贵而无法被许多研究机构轻易使用。FPGA芯片利用细粒度定制设计为加速CYK算法提供了一种新方法。CYK算法表现出复杂的数据依赖关系,其中依赖距离是可变的,而且依赖方向也是跨二维的。我们提出了一个包含一个主PE和多个从PE的收缩阵列结构,用于FPGA上的细粒度硬件实现。我们按列划分任务,并将任务分配给pe以实现负载平衡。我们利用数据重用方案来减少从外部存储器加载矩阵的需要。据我们所知,我们的16 pe实现是唯一实现完整CYK/inside算法的FPGA加速器。实验结果表明,在Pentium 4 2.66GHz CPU的PC平台上运行的inter -0.55软件的速度提高了14倍以上。采用scfg进行RNA二级结构预测时,我们的FPGA加速器平台的计算能力与20个Intel-Xeon cpu组成的PC集群相当,但硬件成本和功耗分别仅为后者的15%和10%左右。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Fine-grained parallel application specific computing for RNA secondary structure prediction using SCFGS on FPGA
In the field of RNA secondary structure prediction, the CYK (Coche-Younger-Kasami) algorithm is a most popular methods using SCFG (stochastic context-free grammars) model. However, general purpose parallel computers including SMP multiprocessors or cluster systems exhibit low parallel efficiency and they are too expensive to be used easily for many research institutes. FPGA chips provide a new approach to accelerate the CYK algorithm by exploiting fine-grained custom design. The CYK algorithm shows complicated data dependence, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master PE and multiple slave PEs for fine grain hardware implementation on FPGA. We partition tasks by columns and assign tasks to PEs for load balance. We exploit data reuse schemes to reduce the need to load matrix from external memory. To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete CYK/inside algorithm. The experimental results show a factor of more than 14 speedup over the Infernal-0.55 software running on a PC platform with Pentium 4 2.66GHz CPU. The computational power of our platform with FPGA accelerator is comparable to a PC cluster consisting of 20 Intel-Xeon CPUs for RNA secondary structure prediction using SCFGs, but the hardware cost and power consumption is only about 15% and 10% of the latter respectively.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信