M. Valero-García, J. Navarro, J. Llabería, M. Valero
{"title":"使用流水线功能单元实现收缩算法","authors":"M. Valero-García, J. Navarro, J. Llabería, M. Valero","doi":"10.1109/ASAP.1990.145464","DOIUrl":null,"url":null,"abstract":"The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of initiating a new operation before the previous one has been completed. The method permits transformation of a SA so that it can be efficiently executed using PFUs. The method is based on two temporal transformations (slowdown and retiming) and one spatial transformation (coalescing). The temporal transformations permit the modification of the SA in such a way that dependences established by the PFU are preserved. The spatial transformation improves the hardware utilization. The method was applied to 1-D SAs with data contraflow. To demonstrate the effectiveness of the method, the authors describe an efficient implementation of a non-time-homogeneous SA with data contraflow for QR decomposition.<<ETX>>","PeriodicalId":438078,"journal":{"name":"[1990] Proceedings of the International Conference on Application Specific Array Processors","volume":"50 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1990-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Implementation of systolic algorithms using pipelined functional units\",\"authors\":\"M. Valero-García, J. Navarro, J. Llabería, M. Valero\",\"doi\":\"10.1109/ASAP.1990.145464\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of initiating a new operation before the previous one has been completed. The method permits transformation of a SA so that it can be efficiently executed using PFUs. The method is based on two temporal transformations (slowdown and retiming) and one spatial transformation (coalescing). The temporal transformations permit the modification of the SA in such a way that dependences established by the PFU are preserved. The spatial transformation improves the hardware utilization. The method was applied to 1-D SAs with data contraflow. To demonstrate the effectiveness of the method, the authors describe an efficient implementation of a non-time-homogeneous SA with data contraflow for QR decomposition.<<ETX>>\",\"PeriodicalId\":438078,\"journal\":{\"name\":\"[1990] Proceedings of the International Conference on Application Specific Array Processors\",\"volume\":\"50 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1990-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"[1990] Proceedings of the International Conference on Application Specific Array Processors\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASAP.1990.145464\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1990] Proceedings of the International Conference on Application Specific Array Processors","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASAP.1990.145464","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Implementation of systolic algorithms using pipelined functional units
The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of initiating a new operation before the previous one has been completed. The method permits transformation of a SA so that it can be efficiently executed using PFUs. The method is based on two temporal transformations (slowdown and retiming) and one spatial transformation (coalescing). The temporal transformations permit the modification of the SA in such a way that dependences established by the PFU are preserved. The spatial transformation improves the hardware utilization. The method was applied to 1-D SAs with data contraflow. To demonstrate the effectiveness of the method, the authors describe an efficient implementation of a non-time-homogeneous SA with data contraflow for QR decomposition.<>