Optimization of vertical and horizontal beamforming kernels on the PowerPC G4 processor with AltiVec technology

Y.H. Cho, D. Brunke, G. E. Allen, B. Evans
Conference Record of the Thirty-Fourth Asilomar Conference on Signals, Systems and Computers (Cat. No.00CH37154), vol. 2, pp. 1670-1674. Published 2000-10-29. DOI: 10.1109/ACSSC.2000.911273 (https://doi.org/10.1109/ACSSC.2000.911273)
Citations: 3

Abstract

Three-dimensional real-time digital sonar beamforming requires 4 to 12 GFLOPS, 1 to 2 GB of memory, and about 100 MB/s of I/O bandwidth. G.E. Allen and B.L. Evans have implemented a 4-GFLOPS sonar beamformer in real time on a Sun UltraSPARC-II server with sixteen 333-MHz processors by using the Visual Instruction Set (VIS) single-instruction multiple-data (SIMD) extensions. In this paper, we rewrite the horizontal and vertical beamforming kernels to use the AltiVec SIMD extension for the PowerPC. AltiVec can execute up to four 32-bit floating-point multiply-accumulate (MAC) operations per instruction. In the PowerPC implementation, we prefetch and realign data for the 128-bit SIMD registers of AltiVec. We measure the performance of these beamforming kernels on the PowerPC and the UltraSPARC-II to assess the impact of the compiler, SIMD word alignment, and cache block alignment on performance.
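
The two mechanisms named in the abstract, a four-wide 32-bit floating-point MAC and the realignment of data for 128-bit registers, can be illustrated with a small portable C model. This is only a sketch under stated assumptions, not the paper's kernels: the type vec4f and the helpers madd4, load_aligned, realign, and weighted_sum are hypothetical names introduced here. On AltiVec hardware, madd4 would roughly correspond to the vec_madd intrinsic, load_aligned to vec_ld, and realign to vec_perm with a mask from vec_lvsl. Prefetching is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one 128-bit SIMD register: four 32-bit floats. */
typedef struct { float lane[4]; } vec4f;

/* Four multiply-accumulates at once: r[i] = a[i] * b[i] + c[i].
   Models what a single 4-wide MAC instruction computes. */
static vec4f madd4(vec4f a, vec4f b, vec4f c)
{
    vec4f r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return r;
}

/* Aligned load: fetches block number `block`, i.e. floats 4*block .. 4*block+3.
   Models a load that can only start on a 16-byte boundary. */
static vec4f load_aligned(const float *base, size_t block)
{
    vec4f r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = base[4 * block + i];
    return r;
}

/* Realignment: picks four consecutive floats starting at lane `shift`
   of the eight-float concatenation lo:hi. */
static vec4f realign(vec4f lo, vec4f hi, unsigned shift)
{
    assert(shift < 4);
    vec4f r;
    for (unsigned i = 0; i < 4; i++) {
        unsigned j = shift + i;
        r.lane[i] = (j < 4) ? lo.lane[j] : hi.lane[j - 4];
    }
    return r;
}

/* Weighted sum of n samples beginning at float index `start`, which need not
   be a multiple of 4. Assumptions: n is a multiple of 4, `weights` starts on
   a block boundary, and `samples` holds at least 4 * ((start + n) / 4 + 1)
   floats so that every aligned block read stays in bounds. */
static float weighted_sum(const float *samples, size_t start,
                          const float *weights, size_t n)
{
    assert(n % 4 == 0);
    unsigned shift = (unsigned)(start % 4);
    size_t block = start / 4;
    vec4f acc = {{0.0f, 0.0f, 0.0f, 0.0f}};
    vec4f lo = load_aligned(samples, block);

    for (size_t i = 0; i < n; i += 4) {
        vec4f hi = load_aligned(samples, block + 1); /* next aligned block */
        vec4f x = realign(lo, hi, shift);            /* four unaligned samples */
        vec4f w = load_aligned(weights, i / 4);
        acc = madd4(x, w, acc);                      /* four MACs per step */
        lo = hi;                                     /* reuse block next time */
        block++;
    }
    /* Horizontal reduction of the four partial sums. */
    return acc.lane[0] + acc.lane[1] + acc.lane[2] + acc.lane[3];
}
```

In this model each loop iteration issues one new aligned load of samples and reuses the previous block, which is the usual motivation for realigning in registers rather than copying data to an aligned buffer. A caller might pass a padded 12-float buffer with start = 1 and n = 8 to sum eight samples that straddle three aligned blocks.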