{"title":"通过评估PDES性能来提高系统级模型的并行性","authors":"E. Arasteh, R. Dömer","doi":"10.1109/FDL53530.2021.9568385","DOIUrl":null,"url":null,"abstract":"For effective embedded system design, transaction level modeling (TLM) must explicitly expose any available parallelism in the application. Traditional TLM in SystemC utilizes channels for communication and synchronization between concurrent modules, whereas modern TLM-2.0 emphasizes address-accurate communication via explicit interconnect and memories. In both modeling styles, the choice of synchronization mechanisms has a significant impact on the available parallelism in the model which can be exploited by parallel discrete event simulation (PDES). In this work, we propose and analyze a set of non-invasive standard-compliant modeling techniques to increase parallelism in IEEE SystemC TLM-1 and TLM-2.0 models. We measure the performance of aggressive out-of-order PDES in the Recoding Infrastructure for SystemC (RISC) and analyze the parallelism in the models. Our case study on six modeling styles of a state-of-art deep neural network (DNN), namely the GoogLeNet image classification algorithm, demonstrates the impact of varying synchronization mechanisms with simulator run time reduced by 38% compared to a synchronous parallel reference model on a 16-core host machine. Our study also suggests that increased parallel simulation performance indicates better models with higher amounts of parallelism exposed.","PeriodicalId":114039,"journal":{"name":"2021 Forum on specification & Design Languages (FDL)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Improving Parallelism in System Level Models by Assessing PDES Performance\",\"authors\":\"E. Arasteh, R. Dömer\",\"doi\":\"10.1109/FDL53530.2021.9568385\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"For effective embedded system design, transaction level modeling (TLM) must explicitly expose any available parallelism in the application. Traditional TLM in SystemC utilizes channels for communication and synchronization between concurrent modules, whereas modern TLM-2.0 emphasizes address-accurate communication via explicit interconnect and memories. In both modeling styles, the choice of synchronization mechanisms has a significant impact on the available parallelism in the model which can be exploited by parallel discrete event simulation (PDES). In this work, we propose and analyze a set of non-invasive standard-compliant modeling techniques to increase parallelism in IEEE SystemC TLM-1 and TLM-2.0 models. We measure the performance of aggressive out-of-order PDES in the Recoding Infrastructure for SystemC (RISC) and analyze the parallelism in the models. Our case study on six modeling styles of a state-of-art deep neural network (DNN), namely the GoogLeNet image classification algorithm, demonstrates the impact of varying synchronization mechanisms with simulator run time reduced by 38% compared to a synchronous parallel reference model on a 16-core host machine. 
Our study also suggests that increased parallel simulation performance indicates better models with higher amounts of parallelism exposed.\",\"PeriodicalId\":114039,\"journal\":{\"name\":\"2021 Forum on specification & Design Languages (FDL)\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 Forum on specification & Design Languages (FDL)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FDL53530.2021.9568385\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Forum on specification & Design Languages (FDL)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FDL53530.2021.9568385","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
For effective embedded system design, transaction level modeling (TLM) must explicitly expose any available parallelism in the application. Traditional TLM in SystemC uses channels for communication and synchronization between concurrent modules, whereas modern TLM-2.0 emphasizes address-accurate communication via explicit interconnects and memories. In both modeling styles, the choice of synchronization mechanisms has a significant impact on the parallelism available in the model, which parallel discrete event simulation (PDES) can exploit. In this work, we propose and analyze a set of non-invasive, standard-compliant modeling techniques that increase parallelism in IEEE SystemC TLM-1 and TLM-2.0 models. We measure the performance of aggressive out-of-order PDES in the Recoding Infrastructure for SystemC (RISC) and analyze the parallelism in the models. Our case study on six modeling styles of a state-of-the-art deep neural network (DNN), the GoogLeNet image classification algorithm, demonstrates the impact of varying synchronization mechanisms: simulator run time is reduced by 38% compared to a synchronous parallel reference model on a 16-core host machine. Our study also suggests that increased parallel simulation performance indicates better models that expose more parallelism.
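To make the channel-based TLM-1 style concrete, the sketch below shows two concurrent SystemC modules synchronizing through a bounded FIFO channel. This is an illustrative example only, not code from the paper: the module names, data values, and FIFO depth are hypothetical, and it assumes a standard IEEE 1666 SystemC installation. Each blocking read() and write() is a synchronization point of the kind whose placement, per the abstract, determines how much parallelism a PDES simulator such as RISC can exploit.

```cpp
// Minimal sketch: channel-based TLM-1 style communication in SystemC.
// Illustrative only; not from the paper. Module names, data values,
// and the FIFO depth are hypothetical.
#include <systemc.h>

SC_MODULE(Producer) {
    sc_fifo_out<int> out;          // port to a bounded FIFO channel

    void run() {
        for (int i = 0; i < 4; ++i)
            out.write(i);          // blocks when the FIFO is full
    }

    SC_CTOR(Producer) { SC_THREAD(run); }
};

SC_MODULE(Consumer) {
    sc_fifo_in<int> in;            // port to the same FIFO channel

    void run() {
        for (int i = 0; i < 4; ++i) {
            int v = in.read();     // blocks when the FIFO is empty
            std::cout << sc_time_stamp() << ": consumed " << v << std::endl;
        }
    }

    SC_CTOR(Consumer) { SC_THREAD(run); }
};

int sc_main(int argc, char* argv[]) {
    sc_fifo<int> fifo(2);          // depth-2 channel between the modules
    Producer prod("prod");
    Consumer cons("cons");
    prod.out(fifo);                // bind ports to the channel
    cons.in(fifo);
    sc_start();                    // run until no more events
    return 0;
}
```

In a TLM-2.0 rendering of the same design, the modules would instead exchange address-accurate transactions through sockets, an explicit interconnect, and a memory; the abstract's point is that where such blocking synchronization points sit in the model is what determines how much of it an out-of-order PDES simulator can run in parallel.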