{"title":"Multiple instruction issue in the NonStop Cyclone processor","authors":"R. Horst, R. L. Harris, Robert L. Jardine","doi":"10.1145/325164.325147","DOIUrl":null,"url":null,"abstract":"The architecture for issuing multiple instructions per clock in the NonStop Cyclone processor is described. Pairs of instructions are fetched and decoded by a dual two-stage prefetch pipeline and passed to a dual six-stage pipeline for execution. Dynamic branch prediction is used to reduce branch penalties. A unique microcode routine for each pair is stored in the large duplexed control store. The microcode controls parallel data paths optimized for executing the most frequent instruction pairs. Other features of the architecture include cache support for unaligned double-precision accesses, a virtually addressed main memory, and a novel precise exception mechanism.<<ETX>>","PeriodicalId":297046,"journal":{"name":"[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1990-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"65","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/325164.325147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 65
Abstract
The architecture for issuing multiple instructions per clock in the NonStop Cyclone processor is described. Pairs of instructions are fetched and decoded by a dual two-stage prefetch pipeline and passed to a dual six-stage pipeline for execution. Dynamic branch prediction is used to reduce branch penalties. A unique microcode routine for each pair is stored in the large duplexed control store. The microcode controls parallel data paths optimized for executing the most frequent instruction pairs. Other features of the architecture include cache support for unaligned double-precision accesses, a virtually addressed main memory, and a novel precise exception mechanism.<>