Tolerating Memory Latency Using a Hardware-Based Active-Pushing Technique
Liwen Shi, Xiaoya Fan, Jing Chen, Xiaoping Huang, Hangpei Tian
2009 International Conference on Embedded Software and Systems, 2009-05-25. DOI: 10.1109/ICESS.2009.65
The pre-sending technique, which originated in distributed shared memory systems, pushes data to the cache instead of pulling it, with the aim of reducing communication traffic. To improve the cache hit ratio effectively, this paper proposes a hardware-based active-pushing technique, which directs data owners, such as the lower levels of the memory hierarchy, to actively push predicted data at the right moment to an upper level closer to the CPU, thereby reducing memory stall time. A further optimization targeting the timeliness of the active-pushing technique is also introduced. The prefetching, pre-sending, active-pushing, and optimized active-pushing techniques are each simulated on the "Longtium" R2 microprocessor simulation platform. Experimental results show that both the active-pushing technique and its optimized variant improve the cache hit ratio significantly compared with prefetching and pre-sending.
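
The difference between pulling and pushing can be illustrated with a toy model. The sketch below is not the paper's design: the `UpperCache` and `LowerLevel` classes, the fully associative LRU policy, and the next-sequential-line predictor are all assumptions made for illustration, and timeliness (the subject of the paper's further optimization) is not modeled at all. It only shows the control flow in which the data owner, rather than the requester, decides what to send upward.

```python
from collections import OrderedDict


class UpperCache:
    """Hypothetical fully associative LRU cache that tracks line addresses only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def lookup(self, line):
        """Return True on a hit and refresh the line's LRU position."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        return False

    def fill(self, line):
        """Install a line, evicting the least recently used one if full."""
        self.lines[line] = True
        self.lines.move_to_end(line)
        while len(self.lines) > self.capacity:
            self.lines.popitem(last=False)


class LowerLevel:
    """Hypothetical data owner. With push enabled it predicts the next line
    (assumed predictor: address + 1) and sends it upward unasked."""

    def __init__(self, upper, push):
        self.upper = upper
        self.push = push

    def request(self, line):
        self.upper.fill(line)          # answer the demand miss (pull)
        if self.push:
            self.upper.fill(line + 1)  # actively push the predicted line


def hit_ratio(trace, capacity=4, push=False):
    """Replay a trace of line addresses and return the upper-level hit ratio."""
    upper = UpperCache(capacity)
    lower = LowerLevel(upper, push)
    hits = 0
    for line in trace:
        if upper.lookup(line):
            hits += 1
        else:
            lower.request(line)
    return hits / len(trace)


if __name__ == "__main__":
    trace = list(range(16))  # purely sequential accesses, chosen for illustration
    print("pull only:", hit_ratio(trace, push=False))
    print("with push:", hit_ratio(trace, push=True))
```

On this artificial sequential trace every pull-only access misses, while the pushing variant hits on every second access because the lower level has already delivered the next line. These figures describe the toy model only and say nothing about the results reported for the "Longtium" R2 platform.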