基于低质量指令预测的并发多线程处理器线程调度

The 3rd International IEEE-NEWCAS Conference, 2005. Pub Date : 2005-06-19 DOI:10.1109/NEWCAS.2005.1496686

H. Homayoun, K. Li, S. Rafatirad

{"title":"基于低质量指令预测的并发多线程处理器线程调度","authors":"H. Homayoun, K. Li, S. Rafatirad","doi":"10.1109/NEWCAS.2005.1496686","DOIUrl":null,"url":null,"abstract":"A simultaneous multithreaded (SMT) processor is capable of executing instructions from multiple threads in the same cycle. SMT in fact was introduced as a complementary architecture to superscalar to increase the throughput of the processor. Recently, several computer manufacturers have introduced their first generation SMT architecture. SMT permits multiple threads to compete simultaneously for shared resources. An example is the race for the fetch unit which is a critical logic responsible for thread scheduling decisions. When more threads than hardware execution contexts are available, the decision of choosing the best threads to fetch instructions from, will affect the processor's efficiency. In this paper the authors presented a new approach to choose the most useful threads among all available threads while they compete on a shared resource. The quality of instructions was identified based on the time they spend in the instruction queue. Low-quality instructions spend more time in the instruction queue. Accordingly threads with fewer number of low-quality instructions have a higher contribution to the entire processor throughput. In an experimental study, such low-quality instructions in each thread was identified to a maximum of 92% accuracy (average 72%). The authors exploited this to increase the overall processor throughput by giving higher priority to threads with lesser number of low-quality instructions. Overall an average of 11% performance improvement was achieved over the traditional algorithm that schedules threads in a round-robin fashion.","PeriodicalId":131387,"journal":{"name":"The 3rd International IEEE-NEWCAS Conference, 2005.","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Thread scheduling based on low-quality instruction prediction for simultaneous multithreaded processors\",\"authors\":\"H. Homayoun, K. Li, S. Rafatirad\",\"doi\":\"10.1109/NEWCAS.2005.1496686\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A simultaneous multithreaded (SMT) processor is capable of executing instructions from multiple threads in the same cycle. SMT in fact was introduced as a complementary architecture to superscalar to increase the throughput of the processor. Recently, several computer manufacturers have introduced their first generation SMT architecture. SMT permits multiple threads to compete simultaneously for shared resources. An example is the race for the fetch unit which is a critical logic responsible for thread scheduling decisions. When more threads than hardware execution contexts are available, the decision of choosing the best threads to fetch instructions from, will affect the processor's efficiency. In this paper the authors presented a new approach to choose the most useful threads among all available threads while they compete on a shared resource. The quality of instructions was identified based on the time they spend in the instruction queue. Low-quality instructions spend more time in the instruction queue. Accordingly threads with fewer number of low-quality instructions have a higher contribution to the entire processor throughput. In an experimental study, such low-quality instructions in each thread was identified to a maximum of 92% accuracy (average 72%). The authors exploited this to increase the overall processor throughput by giving higher priority to threads with lesser number of low-quality instructions. Overall an average of 11% performance improvement was achieved over the traditional algorithm that schedules threads in a round-robin fashion.\",\"PeriodicalId\":131387,\"journal\":{\"name\":\"The 3rd International IEEE-NEWCAS Conference, 2005.\",\"volume\":\"66 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2005-06-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The 3rd International IEEE-NEWCAS Conference, 2005.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NEWCAS.2005.1496686\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 3rd International IEEE-NEWCAS Conference, 2005.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NEWCAS.2005.1496686","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

同步多线程(SMT)处理器能够在同一周期内执行来自多个线程的指令。SMT实际上是作为超标量的补充架构引入的，以提高处理器的吞吐量。最近，几家计算机制造商推出了他们的第一代SMT体系结构。SMT允许多个线程同时竞争共享资源。一个例子是获取单元的竞争，这是负责线程调度决策的关键逻辑。当可用的线程多于硬件执行上下文时，选择从哪个线程获取指令将影响处理器的效率。在本文中，作者提出了一种新的方法，当线程在共享资源上竞争时，从所有可用线程中选择最有用的线程。指令的质量是根据它们在指令队列中花费的时间来确定的。低质量的指令在指令队列中花费的时间更长。因此，低质量指令数量较少的线程对整个处理器吞吐量的贡献更高。在一项实验研究中，每个线程中的这种低质量指令的识别准确率最高为92%(平均为72%)。作者利用这一点，通过为具有较少低质量指令的线程提供更高的优先级来提高总体处理器吞吐量。总体而言，与以循环方式调度线程的传统算法相比，性能平均提高了11%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Thread scheduling based on low-quality instruction prediction for simultaneous multithreaded processors

A simultaneous multithreaded (SMT) processor is capable of executing instructions from multiple threads in the same cycle. SMT in fact was introduced as a complementary architecture to superscalar to increase the throughput of the processor. Recently, several computer manufacturers have introduced their first generation SMT architecture. SMT permits multiple threads to compete simultaneously for shared resources. An example is the race for the fetch unit which is a critical logic responsible for thread scheduling decisions. When more threads than hardware execution contexts are available, the decision of choosing the best threads to fetch instructions from, will affect the processor's efficiency. In this paper the authors presented a new approach to choose the most useful threads among all available threads while they compete on a shared resource. The quality of instructions was identified based on the time they spend in the instruction queue. Low-quality instructions spend more time in the instruction queue. Accordingly threads with fewer number of low-quality instructions have a higher contribution to the entire processor throughput. In an experimental study, such low-quality instructions in each thread was identified to a maximum of 92% accuracy (average 72%). The authors exploited this to increase the overall processor throughput by giving higher priority to threads with lesser number of low-quality instructions. Overall an average of 11% performance improvement was achieved over the traditional algorithm that schedules threads in a round-robin fashion.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

The 3rd International IEEE-NEWCAS Conference, 2005.

自引率

0.00%

发文量