Shih-Kai Lin, Ding-Yong Hong, Sheng-Yu Fu, Jan-Jan Wu, W. Hsu
{"title":"使用受限事务性内存的应用程序的动态调优","authors":"Shih-Kai Lin, Ding-Yong Hong, Sheng-Yu Fu, Jan-Jan Wu, W. Hsu","doi":"10.1145/3264746.3264789","DOIUrl":null,"url":null,"abstract":"Transactional Synchronization Extensions (TSX) support for hardware Transactional Memory (TM) on Intel 4th generation Core processors. Two programming interfaces, Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM), are provided to support software development using TSX. HLE is easy to use and maintains backward compatible with processors without TSX support while RTM is more flexible and scalable. Previous researches have shown that critical sections protected by RTM with a well-designed retry mechanism as its fallback code path can often achieve better performance than HLE. More parallel programs may be programmed in HLE, however, using RTM may obtain greater performance. To embrace both productivity and high performance of parallel program with TSX, we present a framework built on QEMU that can dynamically transform HLE instructions in an application binary to fragments of RTM codes with adaptive tuning on the fly. Compared to HLE execution, our prototype achieves 1.56x speedup with 8 threads on average. Due to the scalability of RTM, the speedup will be more significant as the number of threads increases.","PeriodicalId":186790,"journal":{"name":"Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Dynamic tuning of applications using restricted transactional memory\",\"authors\":\"Shih-Kai Lin, Ding-Yong Hong, Sheng-Yu Fu, Jan-Jan Wu, W. Hsu\",\"doi\":\"10.1145/3264746.3264789\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Transactional Synchronization Extensions (TSX) support for hardware Transactional Memory (TM) on Intel 4th generation Core processors. Two programming interfaces, Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM), are provided to support software development using TSX. HLE is easy to use and maintains backward compatible with processors without TSX support while RTM is more flexible and scalable. Previous researches have shown that critical sections protected by RTM with a well-designed retry mechanism as its fallback code path can often achieve better performance than HLE. More parallel programs may be programmed in HLE, however, using RTM may obtain greater performance. To embrace both productivity and high performance of parallel program with TSX, we present a framework built on QEMU that can dynamically transform HLE instructions in an application binary to fragments of RTM codes with adaptive tuning on the fly. Compared to HLE execution, our prototype achieves 1.56x speedup with 8 threads on average. Due to the scalability of RTM, the speedup will be more significant as the number of threads increases.\",\"PeriodicalId\":186790,\"journal\":{\"name\":\"Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3264746.3264789\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3264746.3264789","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Dynamic tuning of applications using restricted transactional memory
Transactional Synchronization Extensions (TSX) support for hardware Transactional Memory (TM) on Intel 4th generation Core processors. Two programming interfaces, Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM), are provided to support software development using TSX. HLE is easy to use and maintains backward compatible with processors without TSX support while RTM is more flexible and scalable. Previous researches have shown that critical sections protected by RTM with a well-designed retry mechanism as its fallback code path can often achieve better performance than HLE. More parallel programs may be programmed in HLE, however, using RTM may obtain greater performance. To embrace both productivity and high performance of parallel program with TSX, we present a framework built on QEMU that can dynamically transform HLE instructions in an application binary to fragments of RTM codes with adaptive tuning on the fly. Compared to HLE execution, our prototype achieves 1.56x speedup with 8 threads on average. Due to the scalability of RTM, the speedup will be more significant as the number of threads increases.