A Fast Trace Aware Statistical Based Prediction Model with Burst Traffic Modeling for Contention Stall in A Priority Based MPSoC Bus

F. Shafiq, T. Isshiki, Dongju Li, H. Kunieda
Published 2016, pp. 37-48. DOI: 10.2197/ipsjtsldm.9.37 (https://doi.org/10.2197/ipsjtsldm.9.37)

Abstract

While Multiprocessor Systems-on-Chip (MPSoCs) are becoming widely adopted in embedded systems, communication architecture analysis for MPSoCs is becoming ever more complex. There is a growing need for faster and more accurate performance estimation techniques for on-chip bus architectures. This paper presents a novel, fast, statistics-based bus stall prediction model that estimates the effects of bus-contention stalls on the cycle count of an application program on a subject MPSoC architecture. Our technique fills the gap left by existing techniques for bus performance estimation, which are either not accurate enough (e.g. static techniques) or too slow to be used in iterative analysis (e.g. cycle-by-cycle arbitration simulation on every bus access). First, we formulate a “single blocking model” that models the blocking of a single bus request by a single bus transfer on another bus master at a time. We then augment this model with a “burst blocking model” that models the bus stalls incurred by burst bus transfers. Together, these two models give a very fast way to predict bus stalls on an MPSoC bus. It is assumed that each processor in the system has a distinct fixed priority and that arbitration is based on priority. The proposed technique uses accumulated “workload statistics” to accurately predict the “stall cycle counts” caused by bus contention. This eliminates the need to simulate arbitration on every bus access, resulting in a substantial speed-up. The proposed technique is verified by experiments on applications such as “synthetic traffic generators”, “Newton-Euler dynamic control calculation for the 6-degrees-of-freedom Stanford manipulator benchmark”, “Random sparse matrix solver for electronic circuit simulations benchmark”, “Fast Fourier Transform with 1024 inputs of complex numbers” and “SPEC95 Fpppp which is a chemical program performing multi-electron integral derivatives”.
Experimental results show that the proposed method delivers speed-up factors of 1.33, 1.7, 74 and 6 over the simulation method for the four benchmark applications, respectively, while the average estimation error is 7% for the “Fast Fourier Transform with 1024 inputs of complex numbers” benchmark and under 1% for the other benchmarks.
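The contrast the abstract draws between cycle-by-cycle arbitration simulation and prediction from accumulated workload statistics can be illustrated with a toy sketch. This is not the authors' single blocking or burst blocking model, whose equations are not given in the abstract. It assumes a hypothetical two-master bus in which the high-priority master issues isolated, fixed-length transfers and the low-priority master simply waits while the bus is held. The function names `build_trace`, `simulate_stalls` and `estimate_stalls` are invented for this illustration.

```python
def build_trace(total_cycles, period, transfer_len):
    """Hypothetical trace: the high-priority master holds the bus for
    transfer_len cycles at the start of every period."""
    return [(t % period) < transfer_len for t in range(total_cycles)]


def simulate_stalls(high_busy, low_requests):
    """Reference method: walk the trace cycle by cycle for every
    low-priority request and count the cycles it spends waiting."""
    stalls = 0
    for t in low_requests:
        while t < len(high_busy) and high_busy[t]:
            t += 1
            stalls += 1
    return stalls


def estimate_stalls(num_transfers, transfer_len, total_cycles, num_requests):
    """Statistical estimate from counts only (assumed toy formula).
    A request arriving uniformly at random lands k cycles before the end
    of a given transfer with probability 1/total_cycles, k = 1..transfer_len,
    so each transfer adds transfer_len*(transfer_len+1)/2 / total_cycles
    expected stall cycles per request."""
    per_request = num_transfers * transfer_len * (transfer_len + 1) / 2 / total_cycles
    return num_requests * per_request


# Example: 100 isolated 4-cycle transfers in 1000 cycles,
# with one low-priority request issued in every cycle.
trace = build_trace(total_cycles=1000, period=10, transfer_len=4)
requests = list(range(1000))
print(simulate_stalls(trace, requests))
print(estimate_stalls(100, 4, 1000, len(requests)))
```

The estimate needs only four numbers, whereas the simulation touches the trace for every request, which is the source of the speed-up in this toy setting. The formula assumes transfers are separated by idle cycles. Back-to-back transfers would lengthen the wait, and that is the kind of effect a burst model would have to capture.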
Source journal: IPSJ Transactions on System LSI Design Methodology (Engineering: Electrical and Electronic Engineering)