{"title":"通过利用页面翻译中的集群来增加TLB覆盖范围","authors":"B. Pham, A. Bhattacharjee, Yasuko Eckert, G. Loh","doi":"10.1109/HPCA.2014.6835964","DOIUrl":null,"url":null,"abstract":"The steadily increasing sizes of main memory capacities require corresponding increases in the processor's translation lookaside buffer (TLB) resources to avoid performance bottlenecks. Large operating system page sizes can mitigate the bottleneck with a smaller TLB, but most OSs and applications do not fully utilize the large-page support in current hardware. Recent work has shown that, while not guaranteed, some virtual-to-physical page mappings exhibit “contiguous” spatial locality in which consecutive virtual pages map to consecutive physical pages. Such locality provides opportunities to coalesce “adjacent” TLB entries for increased reach. We observe that beyond simple adjacent-entry coalescing, many more translations exhibit “clustered” spatial locality in which a group or cluster of nearby virtual pages map to a similarly clustered set of physical pages. In this work, we provide a detailed characterization of the spatial locality among the virtual-to-physical translations. Based on this characterization, we present a multi-granular TLB organization that significantly increases its effective reach and reduces miss rates substantially while requiring no additional OS support. Our evaluation shows that the multi-granular design outperforms conventional TLBs and the recently proposed coalesced TLBs technique.","PeriodicalId":164587,"journal":{"name":"2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)","volume":"473 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"132","resultStr":"{\"title\":\"Increasing TLB reach by exploiting clustering in page translations\",\"authors\":\"B. Pham, A. Bhattacharjee, Yasuko Eckert, G. Loh\",\"doi\":\"10.1109/HPCA.2014.6835964\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The steadily increasing sizes of main memory capacities require corresponding increases in the processor's translation lookaside buffer (TLB) resources to avoid performance bottlenecks. Large operating system page sizes can mitigate the bottleneck with a smaller TLB, but most OSs and applications do not fully utilize the large-page support in current hardware. Recent work has shown that, while not guaranteed, some virtual-to-physical page mappings exhibit “contiguous” spatial locality in which consecutive virtual pages map to consecutive physical pages. Such locality provides opportunities to coalesce “adjacent” TLB entries for increased reach. We observe that beyond simple adjacent-entry coalescing, many more translations exhibit “clustered” spatial locality in which a group or cluster of nearby virtual pages map to a similarly clustered set of physical pages. In this work, we provide a detailed characterization of the spatial locality among the virtual-to-physical translations. Based on this characterization, we present a multi-granular TLB organization that significantly increases its effective reach and reduces miss rates substantially while requiring no additional OS support. Our evaluation shows that the multi-granular design outperforms conventional TLBs and the recently proposed coalesced TLBs technique.\",\"PeriodicalId\":164587,\"journal\":{\"name\":\"2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)\",\"volume\":\"473 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-06-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"132\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCA.2014.6835964\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2014.6835964","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Increasing TLB reach by exploiting clustering in page translations
The steadily increasing sizes of main memory capacities require corresponding increases in the processor's translation lookaside buffer (TLB) resources to avoid performance bottlenecks. Large operating system page sizes can mitigate the bottleneck with a smaller TLB, but most OSs and applications do not fully utilize the large-page support in current hardware. Recent work has shown that, while not guaranteed, some virtual-to-physical page mappings exhibit “contiguous” spatial locality in which consecutive virtual pages map to consecutive physical pages. Such locality provides opportunities to coalesce “adjacent” TLB entries for increased reach. We observe that beyond simple adjacent-entry coalescing, many more translations exhibit “clustered” spatial locality in which a group or cluster of nearby virtual pages map to a similarly clustered set of physical pages. In this work, we provide a detailed characterization of the spatial locality among the virtual-to-physical translations. Based on this characterization, we present a multi-granular TLB organization that significantly increases its effective reach and reduces miss rates substantially while requiring no additional OS support. Our evaluation shows that the multi-granular design outperforms conventional TLBs and the recently proposed coalesced TLBs technique.