Mapping instruction sequences onto EPOM-processor arrays: a framework for parallel data processing
Jean-Paul Theis, Harald Schlimper
Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238), 1998-12-17
DOI: 10.1109/HIPC.1998.737977 (https://doi.org/10.1109/HIPC.1998.737977)
Abstract
The paper introduces an optimized methodology for mapping instruction sequences (ISs) onto EPOM-processor arrays. Its new features result from a systematic specification and exploitation of both instruction-level and processor-level parallelism: because ISs have ultra-low granularity, individual instructions must be allocated to and scheduled on the given processor array. Moreover, the methodology is complete in the sense that it considers both array bus bandwidths and processor resource constraints. It is based on two concepts: 1) instruction sequences (ISs), which represent a generalized form of directed cyclic graphs (DCGs) and allow efficient specification of algorithm parallelism, and whose graph nodes represent instructions from the instruction set of a target processor architecture (J.P. Theis, 1997); 2) the EPOM-processor architecture, an optimized target VLIW processor architecture for parallel implementation of ISs (J.P. Theis and L. Thiele, 1996) that is especially suited to parallel image/multimedia processing (J.P. Theis and L. Thiele, 1995). Special attention is paid to optimizing the mapping of ISs onto EPOM-processor arrays, with minimization of algorithm execution time as the optimization goal. The methodology is partially based on integer linear programming and heuristic techniques. The solution time complexity is substantially reduced by a two-phase hierarchical model that decouples processor array allocation from subsequent scheduling. The efficiency of the methodology was validated experimentally on ISs of well-known algorithm routines.
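The two-phase decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: a simple greedy heuristic stands in for the ILP-based allocation, and the model is deliberately simplified. The function names `allocate` and `schedule`, the parameters `issue_width` and `bus_bandwidth`, unit instruction latency, a one-cycle bus transfer, and an acyclic dependency graph (a single iteration, listed in topological order) are all assumptions made for illustration.

```python
def allocate(instrs, deps, num_procs):
    """Phase 1 (hypothetical heuristic): place each instruction on a processor.

    Prefers the processor that already holds most of the instruction's
    predecessors, then the least-loaded one, to limit bus traffic.
    `instrs` must be in topological order.
    """
    load = [0] * num_procs
    alloc = {}
    for ins in instrs:
        votes = [0] * num_procs
        for pred in deps.get(ins, ()):
            votes[alloc[pred]] += 1
        proc = max(range(num_procs), key=lambda p: (votes[p], -load[p], -p))
        alloc[ins] = proc
        load[proc] += 1
    return alloc


def schedule(instrs, deps, alloc, num_procs, issue_width=1, bus_bandwidth=1):
    """Phase 2: cycle-by-cycle list scheduling with the allocation held fixed.

    Assumed model: every instruction takes one cycle, each processor issues at
    most `issue_width` instructions per cycle, and a result consumed on another
    processor must first cross the bus, which moves at most `bus_bandwidth`
    results per cycle and takes one cycle.
    """
    start = {}     # instruction -> issue cycle
    ready_at = {}  # (instruction, processor) -> first cycle the result is usable there
    # Collect the inter-processor transfers the allocation makes necessary.
    transfers = []
    for ins in instrs:
        for pred in deps.get(ins, ()):
            key = (pred, alloc[ins])
            if alloc[pred] != alloc[ins] and key not in transfers:
                transfers.append(key)

    cycle = 0
    while len(start) < len(instrs):
        # Bus constraint: forward at most `bus_bandwidth` finished results.
        sent = 0
        for key in list(transfers):
            if sent == bus_bandwidth:
                break
            producer, _dest = key
            if ready_at.get((producer, alloc[producer]), cycle + 1) <= cycle:
                ready_at[key] = cycle + 1
                transfers.remove(key)
                sent += 1
        # Processor resource constraint: issue at most `issue_width` each.
        issued = [0] * num_procs
        for ins in instrs:
            proc = alloc[ins]
            if ins in start or issued[proc] == issue_width:
                continue
            preds = deps.get(ins, ())
            if all(ready_at.get((p, proc), cycle + 1) <= cycle for p in preds):
                start[ins] = cycle
                ready_at[(ins, proc)] = cycle + 1
                issued[proc] += 1
        cycle += 1
    return start, cycle  # issue cycles and total execution time in cycles
```

Calling `allocate` first and passing its result to `schedule` mirrors the hierarchy in the abstract: the scheduler never revisits placement decisions, which shrinks the search space at the possible cost of a longer schedule than a joint optimization would find.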