{"title":"基于AltiVec技术的PowerPC G4处理器垂直和水平波束形成内核优化","authors":"Y.H. Cho, D. Brunke, G. E. Allen, B. Evans","doi":"10.1109/ACSSC.2000.911273","DOIUrl":null,"url":null,"abstract":"Three-dimensional real-time digital sonar beamforming requires 4 to 12 GFLOPS, 1 to 2 GB of memory, and about 100 MB/s of I/O bandwidth. G.E. Allen and B.L. Evans have implemented a 4-GFLOP sonar beamformer in real-time on a Sun UltraSPARC II server with 16 333-MHz processors by utilizing the Visual Instruction Set (VIS) single-instruction multiple-data (SIMD) extensions. In this paper, we rewrite the horizontal and vertical beamforming kernels to use AltiVec SIMD extension for the PowerPC. AltiVec can execute up to four 32-bit floating-point multiply and accumulate (MAC) operations per instruction. In the PowerPC implementation, we prefetch and realign data for the I28-bit SIMD registers of AltiVec. We evaluate the performance of these beamforming kernels on the PowerPC and the UltraSPARC-II to evaluate the impact of the compiler, SIMD word alignment, and cache block alignment on performance.","PeriodicalId":10581,"journal":{"name":"Conference Record of the Thirty-Fourth Asilomar Conference on Signals, Systems and Computers (Cat. No.00CH37154)","volume":"39 1","pages":"1670-1674 vol.2"},"PeriodicalIF":0.0000,"publicationDate":"2000-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Optimization of vertical and horizontal beamforming kernels on the PowerPC G4 processor with AltiVec technology\",\"authors\":\"Y.H. Cho, D. Brunke, G. E. Allen, B. Evans\",\"doi\":\"10.1109/ACSSC.2000.911273\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Three-dimensional real-time digital sonar beamforming requires 4 to 12 GFLOPS, 1 to 2 GB of memory, and about 100 MB/s of I/O bandwidth. G.E. Allen and B.L. Evans have implemented a 4-GFLOP sonar beamformer in real-time on a Sun UltraSPARC II server with 16 333-MHz processors by utilizing the Visual Instruction Set (VIS) single-instruction multiple-data (SIMD) extensions. In this paper, we rewrite the horizontal and vertical beamforming kernels to use AltiVec SIMD extension for the PowerPC. AltiVec can execute up to four 32-bit floating-point multiply and accumulate (MAC) operations per instruction. In the PowerPC implementation, we prefetch and realign data for the I28-bit SIMD registers of AltiVec. We evaluate the performance of these beamforming kernels on the PowerPC and the UltraSPARC-II to evaluate the impact of the compiler, SIMD word alignment, and cache block alignment on performance.\",\"PeriodicalId\":10581,\"journal\":{\"name\":\"Conference Record of the Thirty-Fourth Asilomar Conference on Signals, Systems and Computers (Cat. No.00CH37154)\",\"volume\":\"39 1\",\"pages\":\"1670-1674 vol.2\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Conference Record of the Thirty-Fourth Asilomar Conference on Signals, Systems and Computers (Cat. No.00CH37154)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ACSSC.2000.911273\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Conference Record of the Thirty-Fourth Asilomar Conference on Signals, Systems and Computers (Cat. No.00CH37154)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ACSSC.2000.911273","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Optimization of vertical and horizontal beamforming kernels on the PowerPC G4 processor with AltiVec technology
Three-dimensional real-time digital sonar beamforming requires 4 to 12 GFLOPS, 1 to 2 GB of memory, and about 100 MB/s of I/O bandwidth. G.E. Allen and B.L. Evans have implemented a 4-GFLOP sonar beamformer in real-time on a Sun UltraSPARC II server with 16 333-MHz processors by utilizing the Visual Instruction Set (VIS) single-instruction multiple-data (SIMD) extensions. In this paper, we rewrite the horizontal and vertical beamforming kernels to use AltiVec SIMD extension for the PowerPC. AltiVec can execute up to four 32-bit floating-point multiply and accumulate (MAC) operations per instruction. In the PowerPC implementation, we prefetch and realign data for the I28-bit SIMD registers of AltiVec. We evaluate the performance of these beamforming kernels on the PowerPC and the UltraSPARC-II to evaluate the impact of the compiler, SIMD word alignment, and cache block alignment on performance.