Mounir Bahtat, S. Belkouch, Philippe Elleaume, Phillipe Gall
{"title":"基于枚举的VLIW体系结构模调度启发式算法","authors":"Mounir Bahtat, S. Belkouch, Philippe Elleaume, Phillipe Gall","doi":"10.1109/ICM.2014.7071820","DOIUrl":null,"url":null,"abstract":"Modulo scheduling is a software pipelining technique exploiting instruction-level parallelism (ILP) of VLIW architectures to efficiently implement loops. This paper presents a novel enumeration-based resource-constrained heuristic for modulo scheduling. It takes into consideration the criticality of the nodes, generating near optimal schedules in terms of initiation intervals and register requirements. The scheduling algorithm outperformed better-known heuristics in terms of the quality of schedules, while presenting small compilation time enabling it to be used in a production environment. Experimental results on the VLIW TMS320C6678 DSP processor, showed improved performance on a signal processing set of algorithms.","PeriodicalId":107354,"journal":{"name":"2014 26th International Conference on Microelectronics (ICM)","volume":"252 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Fast enumeration-based modulo scheduling heuristic for VLIW architectures\",\"authors\":\"Mounir Bahtat, S. Belkouch, Philippe Elleaume, Phillipe Gall\",\"doi\":\"10.1109/ICM.2014.7071820\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modulo scheduling is a software pipelining technique exploiting instruction-level parallelism (ILP) of VLIW architectures to efficiently implement loops. This paper presents a novel enumeration-based resource-constrained heuristic for modulo scheduling. It takes into consideration the criticality of the nodes, generating near optimal schedules in terms of initiation intervals and register requirements. The scheduling algorithm outperformed better-known heuristics in terms of the quality of schedules, while presenting small compilation time enabling it to be used in a production environment. Experimental results on the VLIW TMS320C6678 DSP processor, showed improved performance on a signal processing set of algorithms.\",\"PeriodicalId\":107354,\"journal\":{\"name\":\"2014 26th International Conference on Microelectronics (ICM)\",\"volume\":\"252 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 26th International Conference on Microelectronics (ICM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICM.2014.7071820\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 26th International Conference on Microelectronics (ICM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICM.2014.7071820","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Fast enumeration-based modulo scheduling heuristic for VLIW architectures
Modulo scheduling is a software pipelining technique exploiting instruction-level parallelism (ILP) of VLIW architectures to efficiently implement loops. This paper presents a novel enumeration-based resource-constrained heuristic for modulo scheduling. It takes into consideration the criticality of the nodes, generating near optimal schedules in terms of initiation intervals and register requirements. The scheduling algorithm outperformed better-known heuristics in terms of the quality of schedules, while presenting small compilation time enabling it to be used in a production environment. Experimental results on the VLIW TMS320C6678 DSP processor, showed improved performance on a signal processing set of algorithms.