Albert Esteve, Alberto Ros, A. Robles, M. E. Gómez, J. Duato
TokenTLB: A Token-Based Page Classification Approach
Proceedings of the 2016 International Conference on Supercomputing, 2016-06-01
DOI: 10.1145/2925426.2926280
Citations: 12
Abstract
Classifying memory accesses into private or shared data has become a fundamental approach to achieving efficiency and scalability in multi- and many-core systems. Since most memory accesses in both sequential and parallel applications target either private data (accessed by only one core) or read-only data (never written), devoting the full cost of coherence to every memory access results in sub-optimal performance and limits the scalability and efficiency of the multiprocessor. This work proposes TokenTLB, a page classification approach based on exchanging and counting tokens. The key observation behind our proposal is that, unlike coherence management, data classification obtains all the benefits of a token-based approach without the burden of complex arbitration mechanisms, which have discouraged the implementation of token-based coherence protocols in commodity systems. Token counting on TLBs is a natural and efficient way to classify memory pages. It does not require complex and undesirable persistent requests or arbitration: when two or more TLBs race to access a page, the tokens are distributed among them and the page is classified as shared. TokenTLB also favors the sharing of translation information among TLBs, which improves system performance and constrains much of the TLB traffic compared to other broadcast-based approaches. This is achieved by requiring that only TLBs holding extra tokens provide them along with the page translation (about one response per TLB miss). TokenTLB effectively increases the fraction of blocks classified as private to up to 61.1% while also allowing read-only detection (24.4% shared-read-only blocks). When TokenTLB is applied to optimize the directory, it reduces the dynamic energy consumed by the cache hierarchy by nearly 27.3% over the baseline.
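The token-counting idea can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's design: it assumes each page has as many tokens as there are cores, that a TLB holding all of a page's tokens treats the page as private, that a responder hands over half of its tokens, and that evicted entries return their tokens to memory. The names `TokenModel`, `miss`, `evict`, and `classify` are hypothetical, and read-only detection is left out.

```python
class TokenModel:
    """Toy model: each page owns `num_cores` tokens, held by TLBs or by memory."""

    def __init__(self, num_cores):
        self.num_cores = num_cores
        self.tlb = [dict() for _ in range(num_cores)]  # tlb[core][page] -> tokens held
        self.memory = {}                               # page -> tokens parked in memory

    def miss(self, core, page):
        """Serve a TLB miss on `core`; return how many responders sent tokens."""
        responses = 0
        gained = 0
        # Assumption: memory starts with all tokens and hands over what it holds.
        in_memory = self.memory.setdefault(page, self.num_cores)
        if in_memory:
            gained += in_memory
            self.memory[page] = 0
            responses += 1
        # Only TLBs holding extra tokens (more than one) answer the broadcast.
        for other, entries in enumerate(self.tlb):
            held = entries.get(page, 0)
            if other != core and held > 1:
                give = held // 2  # hypothetical split policy
                entries[page] = held - give
                gained += give
                responses += 1
        self.tlb[core][page] = self.tlb[core].get(page, 0) + gained
        return responses

    def evict(self, core, page):
        """Drop a TLB entry and return its tokens to memory (assumed policy)."""
        tokens = self.tlb[core].pop(page, 0)
        if tokens:
            self.memory[page] += tokens

    def classify(self, core, page):
        """A page is private only for a TLB that holds every one of its tokens."""
        held = self.tlb[core][page]
        return "private" if held == self.num_cores else "shared"


model = TokenModel(num_cores=4)
model.miss(0, 0x10)
print(model.classify(0, 0x10))  # private
model.miss(1, 0x10)
print(model.classify(0, 0x10))  # shared
```

In this model no arbitration is needed: a second TLB touching the page simply takes some of the tokens, so neither holder can see the full count and both classify the page as shared.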