{"title":"Featherlight Reuse-Distance Measurement","authors":"Qingsen Wang, Xu Liu, Milind Chabbi","doi":"10.1109/HPCA.2019.00056","journal":{"name":"2019 IEEE International Symposium on High Performance Computer Architecture (HPCA)"},"publicationDate":"2019-02-01","citationCount":"20","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/HPCA.2019.00056"}
Data locality has a profound impact on program performance. Reuse distance, the number of distinct memory locations accessed between two consecutive accesses to the same location, is the de facto, machine-independent metric of data locality in a program. Measuring reuse distance typically requires exhaustive instrumentation (of source code or binaries) to log every memory access, which results in orders-of-magnitude runtime slowdown and memory bloat. Such high overheads impede the adoption of reuse-distance tools in long-running production applications despite their usefulness. We develop RDX, a lightweight profiling tool for characterizing reuse distance in an execution; RDX typically incurs negligible time (5%) and memory (7%) overheads. RDX performs no instrumentation whatsoever but uniquely combines hardware performance counter sampling with hardware debug registers, both available in commodity CPUs, to produce reuse-distance histograms. RDX typically has more than 90% accuracy compared to the ground truth. With the help of RDX, we are the first to characterize the memory performance of long-running SPEC CPU2017 benchmarks.
Keywords: reuse distance; locality; hardware performance counters; debug registers; profiling.
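To make the definition of reuse distance concrete, the following minimal Python sketch computes it exhaustively over a small, made-up address trace and builds the corresponding histogram. It illustrates only the metric as defined above, i.e. the kind of full-trace computation whose cost motivates RDX; it is not RDX's sampling-based method. The function names `reuse_distances` and `histogram` and the example trace are assumptions introduced for illustration.

```python
from collections import Counter


def reuse_distances(trace):
    """Return one entry per access: the reuse distance, or None for a first access.

    Reuse distance = number of distinct locations accessed strictly between
    two consecutive accesses to the same location.
    """
    last_seen = {}  # location -> index of its most recent access
    result = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # Distinct locations touched between the previous access and this one.
            between = set(trace[last_seen[addr] + 1 : i])
            result.append(len(between))
        else:
            result.append(None)  # no earlier access, so no finite reuse distance
        last_seen[addr] = i
    return result


def histogram(distances):
    """Count how often each finite reuse distance occurs."""
    return Counter(d for d in distances if d is not None)


# Hypothetical trace of symbolic addresses, for illustration only.
trace = ["a", "b", "c", "a", "b", "b"]
distances = reuse_distances(trace)
print(distances)             # [None, None, None, 2, 2, 0]
print(histogram(distances))  # two reuses at distance 2, one at distance 0
```

This naive version rescans the trace for every reuse, so it is quadratic in the worst case and must see every access, which is exactly the overhead a sampling-based profiler aims to avoid.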