G. Akkad, A. Mansour, B. Elhassan, E. Inaty, R. Ayoubi
Two Stages Parallel LMS Structure: A Pipelined Hardware Architecture
2020 28th European Signal Processing Conference (EUSIPCO), pp. 2363-2367. Published 2021-01-24.
DOI: https://doi.org/10.23919/Eusipco47968.2020.9287770
Citations: 1
Abstract
Modern wireless communication systems have tightened the requirements on adaptive beamformers implemented on Field Programmable Gate Arrays (FPGAs). These requirements impose additional constraints, such as designing a high-throughput, low-complexity system with fast convergence and low steady-state error. Recently, a parallel multi-stage least mean square (pLMS) structure was proposed to mitigate these constraints. pLMS consists of two least mean square (LMS) stages operating in parallel and connected by an error feedback. To form the total pLMS error, the error of the second LMS stage (LMS2) is delayed by one sample and fed back to combine with that of the first LMS stage (LMS1). pLMS provides accelerated convergence while maintaining minimal steady-state error and a computational complexity of order O(N), where N represents the number of antenna elements. However, pipelining the pLMS structure remains difficult because of the LMS coefficient update loop. Thus, in this paper, we propose applying the delay and sum relaxed look-ahead technique to design a high-throughput pipelined hardware architecture for the pLMS, yielding the delayed pLMS (DpLMS). Simulation and synthesis results highlight the superior performance of the DpLMS in presenting a high-throughput architecture while preserving accelerated convergence, low steady-state error, and low computational complexity. DpLMS operates at a maximum frequency of 208.33 MHz and is obtained at the cost of a marginal increase in resource requirements, i.e., additional delay registers compared to the original pLMS design.
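The pipelining obstacle mentioned in the abstract, the LMS coefficient update loop, can be illustrated with a minimal sketch of a single delayed LMS stage. This is not the authors' DpLMS architecture, and it does not model the two-stage error feedback. It only shows the delay relaxation idea in isolation: the weight update uses the error and input vector from `delay` samples earlier, which in hardware would leave room for pipeline registers inside the loop. The function names, the real-valued system-identification setup, the filter `h`, and the step size `mu` are all assumptions chosen for illustration.

```python
import random
from collections import deque


def delayed_lms(x, d, num_taps, mu, delay=0):
    """Adapt FIR weights with the LMS rule.

    The update at time n uses the error and input vector from time
    n - delay (delay=0 reduces to the standard LMS update).
    """
    w = [0.0] * num_taps
    taps = [0.0] * num_taps      # current input vector, newest sample first
    pending = deque()            # (error, input vector) pairs awaiting use
    for x_n, d_n in zip(x, d):
        taps = [x_n] + taps[:-1]                       # shift in new sample
        y_n = sum(wi * ti for wi, ti in zip(w, taps))  # filter output
        e_n = d_n - y_n                                # instantaneous error
        pending.append((e_n, taps))
        if len(pending) > delay:
            # Update with the delayed error and its matching input vector.
            e_old, taps_old = pending.popleft()
            w = [wi + mu * e_old * ti for wi, ti in zip(w, taps_old)]
    return w


def run_demo(delay, seed=0):
    """Identify a hypothetical 4-tap system; return the largest weight error."""
    rng = random.Random(seed)
    h = [0.5, -0.3, 0.2, 0.1]    # assumed unknown system, illustration only
    x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    d, buf = [], [0.0] * len(h)
    for x_n in x:
        buf = [x_n] + buf[:-1]
        d.append(sum(hi * bi for hi, bi in zip(h, buf)))
    w = delayed_lms(x, d, len(h), mu=0.01, delay=delay)
    return max(abs(wi - hi) for wi, hi in zip(w, h))
```

Calling `run_demo(0)` and `run_demo(4)` compares the standard and delayed updates on the same noise-free data. With a small step size, both should settle close to the assumed system, which is the behavior that makes delay relaxation attractive for pipelining. Larger delays generally call for smaller step sizes to remain stable.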