Lorenzo Carletti , Andrea Serafini , Gianluca Brilli , Alessandro Capotondi , Alessandro Biasci , Paolo Valente , Andrea Marongiu
{"title":"仔细研究商用多核soc中的内存干扰效应","authors":"Lorenzo Carletti , Andrea Serafini , Gianluca Brilli , Alessandro Capotondi , Alessandro Biasci , Paolo Valente , Andrea Marongiu","doi":"10.1016/j.sysarc.2025.103487","DOIUrl":null,"url":null,"abstract":"<div><div>Commercial-off-the-shelf (COTS) multicore systems on chip (SoC) represent a cheap and convenient solution for deploying sophisticated workloads in various application domains. The combination of several CPU cores and dedicated acceleration units tightly sharing memory and interconnect systems can provide tremendous peak performance, but also threatens timing predictability due to memory interference. Even when focusing on main CPU cores only, it has been reported that task slowdown due to memory interference can surpass 10<span><math><mo>×</mo></math></span>. Such poorly predictable timing behaviors bar greater adoption of COTS multicore SoCs in the domain of timing-critical applications, and motivate the wide activity of the research community to study solutions aimed at mitigating the problem. Understanding worst-case interference patterns on such hardware platforms is fundamental for building any effective memory interference control mechanism. A common assumption in the literature is that worst-case interference is generated by (and therefore assessed through) read-intensive synthetic workloads with 100% cache miss rate. Yet certain real-life workloads exhibit worse slowdown than what is generated under said assumed worst-case, so we study the interference effects of both synthetic and real-life benchmarks on different multicore SoCs. Our experiments indicate that cache thrashing causes the worst interference experienced by real-life benchmarks – due to their different usage of caches – and that there is no universal worst-case workload for every platform.</div></div>","PeriodicalId":50027,"journal":{"name":"Journal of Systems Architecture","volume":"167 ","pages":"Article 103487"},"PeriodicalIF":3.7000,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Taking a closer look at memory interference effects in commercial-off-the-shelf multicore SoCs\",\"authors\":\"Lorenzo Carletti , Andrea Serafini , Gianluca Brilli , Alessandro Capotondi , Alessandro Biasci , Paolo Valente , Andrea Marongiu\",\"doi\":\"10.1016/j.sysarc.2025.103487\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Commercial-off-the-shelf (COTS) multicore systems on chip (SoC) represent a cheap and convenient solution for deploying sophisticated workloads in various application domains. The combination of several CPU cores and dedicated acceleration units tightly sharing memory and interconnect systems can provide tremendous peak performance, but also threatens timing predictability due to memory interference. Even when focusing on main CPU cores only, it has been reported that task slowdown due to memory interference can surpass 10<span><math><mo>×</mo></math></span>. Such poorly predictable timing behaviors bar greater adoption of COTS multicore SoCs in the domain of timing-critical applications, and motivate the wide activity of the research community to study solutions aimed at mitigating the problem. Understanding worst-case interference patterns on such hardware platforms is fundamental for building any effective memory interference control mechanism. A common assumption in the literature is that worst-case interference is generated by (and therefore assessed through) read-intensive synthetic workloads with 100% cache miss rate. Yet certain real-life workloads exhibit worse slowdown than what is generated under said assumed worst-case, so we study the interference effects of both synthetic and real-life benchmarks on different multicore SoCs. Our experiments indicate that cache thrashing causes the worst interference experienced by real-life benchmarks – due to their different usage of caches – and that there is no universal worst-case workload for every platform.</div></div>\",\"PeriodicalId\":50027,\"journal\":{\"name\":\"Journal of Systems Architecture\",\"volume\":\"167 \",\"pages\":\"Article 103487\"},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2025-07-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Systems Architecture\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1383762125001596\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Systems Architecture","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1383762125001596","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Taking a closer look at memory interference effects in commercial-off-the-shelf multicore SoCs
Commercial-off-the-shelf (COTS) multicore systems on chip (SoC) represent a cheap and convenient solution for deploying sophisticated workloads in various application domains. The combination of several CPU cores and dedicated acceleration units tightly sharing memory and interconnect systems can provide tremendous peak performance, but also threatens timing predictability due to memory interference. Even when focusing on main CPU cores only, it has been reported that task slowdown due to memory interference can surpass 10. Such poorly predictable timing behaviors bar greater adoption of COTS multicore SoCs in the domain of timing-critical applications, and motivate the wide activity of the research community to study solutions aimed at mitigating the problem. Understanding worst-case interference patterns on such hardware platforms is fundamental for building any effective memory interference control mechanism. A common assumption in the literature is that worst-case interference is generated by (and therefore assessed through) read-intensive synthetic workloads with 100% cache miss rate. Yet certain real-life workloads exhibit worse slowdown than what is generated under said assumed worst-case, so we study the interference effects of both synthetic and real-life benchmarks on different multicore SoCs. Our experiments indicate that cache thrashing causes the worst interference experienced by real-life benchmarks – due to their different usage of caches – and that there is no universal worst-case workload for every platform.
期刊介绍:
The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures as well as additional subjects in the computer and system architecture area will fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software.
Design automation of such systems including methodologies, techniques and tools for their design as well as novel designs of software components fall within the scope of this journal. Novel applications that use embedded systems are also central in this journal. While hardware is not a part of this journal hardware/software co-design methods that consider interplay between software and hardware components with and emphasis on software are also relevant here.