{"title":"一种针对无线基带优化的DSP架构","authors":"C. Rowen, P. Nuth, S. Fiske","doi":"10.1109/SOCC.2009.5335658","DOIUrl":null,"url":null,"abstract":"The high computation demands of next generation cellular and broadcast wireless require both higher efficiency and greater flexibility in baseband processing. This paper introduces a new DSP architecture optimized for baseband applications, especially applications with heavy workload of complex filtering, FFT and MIMO matrix operations such as LTE. The Tensilica ConnX Baseband Engine processor core implements a 3-issue VLIW, 8-way SIMD architecture. It can perform 16 multiply-add operations per cycle, and executes a full radix-4 FFT butterfly or 4 complex FIR filter taps per cycle. It directly implements vector division and reciprocal square root operations. At 400MHz, it provides almost 13GB per second of memory bandwidth. The rich programming environment, including vectorization of scalar C applications, allows easy deployment into cellular base-station, femto-cell and other software-agile radio applications, and into multi-standard broadcast receivers.","PeriodicalId":389625,"journal":{"name":"2009 International Symposium on System-on-Chip","volume":"2010 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"A DSP architecture optimized for wireless baseband\",\"authors\":\"C. Rowen, P. Nuth, S. Fiske\",\"doi\":\"10.1109/SOCC.2009.5335658\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The high computation demands of next generation cellular and broadcast wireless require both higher efficiency and greater flexibility in baseband processing. This paper introduces a new DSP architecture optimized for baseband applications, especially applications with heavy workload of complex filtering, FFT and MIMO matrix operations such as LTE. The Tensilica ConnX Baseband Engine processor core implements a 3-issue VLIW, 8-way SIMD architecture. It can perform 16 multiply-add operations per cycle, and executes a full radix-4 FFT butterfly or 4 complex FIR filter taps per cycle. It directly implements vector division and reciprocal square root operations. At 400MHz, it provides almost 13GB per second of memory bandwidth. The rich programming environment, including vectorization of scalar C applications, allows easy deployment into cellular base-station, femto-cell and other software-agile radio applications, and into multi-standard broadcast receivers.\",\"PeriodicalId\":389625,\"journal\":{\"name\":\"2009 International Symposium on System-on-Chip\",\"volume\":\"2010 2\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-10-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 International Symposium on System-on-Chip\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SOCC.2009.5335658\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 International Symposium on System-on-Chip","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SOCC.2009.5335658","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A DSP architecture optimized for wireless baseband
The high computation demands of next generation cellular and broadcast wireless require both higher efficiency and greater flexibility in baseband processing. This paper introduces a new DSP architecture optimized for baseband applications, especially applications with heavy workload of complex filtering, FFT and MIMO matrix operations such as LTE. The Tensilica ConnX Baseband Engine processor core implements a 3-issue VLIW, 8-way SIMD architecture. It can perform 16 multiply-add operations per cycle, and executes a full radix-4 FFT butterfly or 4 complex FIR filter taps per cycle. It directly implements vector division and reciprocal square root operations. At 400MHz, it provides almost 13GB per second of memory bandwidth. The rich programming environment, including vectorization of scalar C applications, allows easy deployment into cellular base-station, femto-cell and other software-agile radio applications, and into multi-standard broadcast receivers.