Vina-FPGA-Cluster: Multi-FPGA Based Molecular Docking Tool With High-Accuracy and Multi-Level Parallelism
Authors: Ming Ling, Zhihao Feng, Ruiqi Chen, Yi Shao, Shidi Tang, Yanxiang Zhu
Journal: IEEE Transactions on Biomedical Circuits and Systems, vol. 18, no. 6, pp. 1321-1337
Published: 2024-04-15
DOI: 10.1109/TBCAS.2024.3388323
URL: https://ieeexplore.ieee.org/document/10500753/
Citations: 0
Abstract
AutoDock Vina (Vina) stands out among molecular docking tools for its precision and comparatively high speed, and it plays a key role in the drug discovery process. Hardware acceleration of Vina on FPGA platforms offers an energy-efficient way to speed up docking. However, previous FPGA-based Vina accelerators have several shortcomings: 1) simple uniform quantization causes an inevitable accuracy drop; 2) Vina's complex computing process prolongs the evaluation and optimization phase of hardware design; 3) the iterative computations in Vina limit the potential for further parallelization; and 4) the system's scalability is limited by its unwieldy architecture. To address these challenges, we propose Vina-FPGA-cluster, a multi-FPGA-based molecular docking tool enabling high-accuracy and multi-level parallel Vina acceleration. Building on Vina-FPGA, we first adopt hybrid fixed-point quantization to minimize accuracy loss. We then propose a SystemC-based model that accelerates the evaluation of accelerator architecture designs. Next, we propose a novel bidirectional AG module for data-level parallelism. Finally, we optimize the system architecture for scalable deployment on multiple Xilinx ZCU104 boards, achieving task-level parallelism. Vina-FPGA-cluster is tested on three representative molecular docking datasets. The experimental results indicate that, in terms of RMSD (a docking outcome counts as successful when its RMSD is below 2 Å), Vina-FPGA-cluster shows a loss of only 0.2%. Relative to the CPU and Vina-FPGA, Vina-FPGA-cluster achieves 27.33$\times$ and 7.26$\times$ speedups, respectively. Notably, Vina-FPGA-cluster delivers a 1.38$\times$ speedup relative to the GPU implementation (Vina-GPU) while consuming just 28.99% of the power.
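As an illustration of the evaluation metrics mentioned above, the following minimal Python sketch computes a plain RMSD between two poses, applies the 2 Å success threshold, and combines a speedup with a power ratio into a relative energy figure. It is not the authors' evaluation code. The function names are hypothetical, the RMSD ignores ligand symmetry and assumes both poses list the same atoms in the same order, and the energy calculation assumes that the reported power percentage is average power relative to the baseline over the whole run.

```python
import math


def rmsd(pred, ref):
    """Plain RMSD (in Å) between two poses given as lists of (x, y, z) tuples.

    Assumption: atoms are listed in the same order in both poses and no
    symmetry correction is applied.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(pred, ref)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Fraction of docking runs whose RMSD is strictly below the threshold (Å)."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    return sum(r < threshold for r in rmsds) / len(rmsds)


def relative_energy(speedup, power_fraction):
    """Energy per task relative to a baseline.

    energy = power * time, so the ratio is (P / P0) * (t / t0),
    which equals power_fraction / speedup.
    """
    return power_fraction / speedup


if __name__ == "__main__":
    pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(f"RMSD: {rmsd(pose, crystal):.3f} Å")
    print(f"Success rate: {success_rate([1.0, 2.5, 1.9, 3.0]):.2f}")
    print(f"Relative energy: {relative_energy(1.38, 0.2899):.3f}")
```

Under the stated assumption about the power figure, a 1.38$\times$ speedup at 28.99% of the power would correspond to roughly 0.21 of the baseline energy per docking task. This is derived arithmetic on the reported numbers, not a result stated in the abstract.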