Scheduling divisible loads on partially reconfigurable hardware
K. Vikram, V. Vasudevan
2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2006-04-24
DOI: 10.1109/FCCM.2006.63 (https://doi.org/10.1109/FCCM.2006.63)
Citations: 1
Abstract
For a task mapped to the reconfigurable fabric (RF) of a partially reconfigurable hybrid processor architecture, significant speedup can be obtained if multiple processing units (PUs) are used to accelerate the task. In this paper, the authors present the results of a quantitative analysis for a single data-parallel task mapped to the RF of a bus-based hybrid processor architecture. The architectural constraints in this case include run-time reconfiguration delay and a shared data bus to main memory.
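The trade-off named in the abstract can be illustrated with a toy model. The sketch below is not the authors' analysis: it assumes PUs are configured one after another with a fixed delay each, that the load is split equally, that input data reaches the PUs one chunk at a time over the shared bus, and that a PU starts computing only once its whole chunk has arrived. The function name `makespan` and all parameter names are hypothetical.

```python
def makespan(load, n_pus, reconfig_delay, bus_time_per_unit, compute_time_per_unit):
    """Finish time of a divisible load under a toy model (all assumptions ours).

    - PUs are configured sequentially; PU i is ready at i * reconfig_delay.
    - The load is split into equal chunks, one per PU.
    - The shared bus carries one chunk at a time, in PU order.
    - A PU computes only after its entire chunk has been transferred.
    """
    chunk = load / n_pus
    bus_free = 0.0   # time at which the shared bus next becomes idle
    finish = 0.0     # latest completion time seen so far
    for i in range(1, n_pus + 1):
        configured_at = i * reconfig_delay
        # Transfer can start only when the PU exists and the bus is idle.
        start_transfer = max(bus_free, configured_at)
        bus_free = start_transfer + chunk * bus_time_per_unit
        finish = max(finish, bus_free + chunk * compute_time_per_unit)
    return finish


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the shape of the curve.
    for n in (1, 2, 4, 8, 16, 32):
        print(n, makespan(100, n, 5, 0.1, 1.0))
```

Under these assumed parameters the finish time first drops as PUs are added and then rises again once the accumulated reconfiguration delay outweighs the smaller per-PU chunk, which is the kind of behavior a quantitative analysis of this setting has to capture.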