Synthesis of custom networks of heterogeneous processing elements for complex physical system emulation

Proceedings of the Eighth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis. Pub Date: 2012-10-07. DOI: 10.1145/2380445.2380483

Chen-Chun Huang, Bailey Miller, F. Vahid, T. Givargis


Citations: 5

Abstract

Physical system models that consist of thousands of ordinary differential equations can be synthesized to field-programmable gate arrays (FPGAs) for highly parallelized, real-time physical system emulation. Previous work introduced synthesis of custom networks of homogeneous processing elements, in which the elements are either all general differential equation solvers or all custom solvers tailored to specific equations. However, a complex physical system model may contain different types of equations, so that using only general solvers or only custom solvers does not provide all of the possible speedup. We introduce methods to synthesize a custom network of heterogeneous processing elements for emulating physical systems, where each element is either a general or a custom differential equation solver. We show average speedups of 45x over a 3 GHz single-core desktop processor, and of 11x and 20x over a 3 GHz four-core desktop processor and a 763 MHz NVIDIA graphics processing unit, respectively. Compared to a commercial high-level synthesis tool including regularity extraction, the networks of heterogeneous processing elements were on average 10.8x faster. Compared to homogeneous networks of general and single-type custom processing elements, heterogeneous networks were on average 7x and 6x faster, respectively.
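
The distinction between general and custom solvers can be illustrated with a small software sketch. This is not the paper's synthesis method or hardware design; it is only a minimal Python illustration under stated assumptions: forward Euler integration, one element per state variable, and lockstep updates. The names `GeneralPE`, `DecayPE`, and `emulate` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List

State = Dict[str, float]


class GeneralPE:
    """Hypothetical general solver: integrates any dx/dt = f(state) by forward Euler."""

    def __init__(self, var: str, deriv: Callable[[State], float]):
        self.var = var
        self.deriv = deriv

    def step(self, state: State, dt: float) -> float:
        return state[self.var] + dt * self.deriv(state)


class DecayPE:
    """Hypothetical custom solver: hard-wired to one equation type, dx/dt = -k * x."""

    def __init__(self, var: str, k: float):
        self.var = var
        self.k = k

    def step(self, state: State, dt: float) -> float:
        # Same forward Euler update, with the equation folded into one multiply.
        return state[self.var] * (1.0 - dt * self.k)


def emulate(pes: List, state: State, dt: float, steps: int) -> State:
    """Advance all elements in lockstep; every element reads only the previous state.

    Assumes each state variable is owned by exactly one element.
    """
    for _ in range(steps):
        state = {pe.var: pe.step(state, dt) for pe in pes}
    return state


# A heterogeneous network: one custom element and one general element.
network = [
    DecayPE("a", k=2.0),
    GeneralPE("b", lambda s: s["a"] - s["b"]),
]
final = emulate(network, {"a": 1.0, "b": 0.0}, dt=0.1, steps=1)
```

In this sketch the custom element computes the same update a general element would for its equation, but with the equation fixed in advance; the general element accepts any derivative function at the cost of evaluating it generically.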

