A configurable H.265-compatible motion estimation accelerator architecture for realtime 4K video encoding in 65 nm CMOS
Michael Braly, Aaron Stillmaker, B. Baas
DASC-PICom-DataCom-CyberSciTech 2017 : 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing ; 2017 IEEE 15th International Conference on Pervasive Intelligence and Computing ; 2017 IEEE 3rd International..., pp. 79-85
Published: 2017-08-01
DOI: https://doi.org/10.1109/DESEC.2017.8073837
Citations: 2
Abstract
The design of a configurable motion estimation accelerator is presented and shown to be suitable for realtime digital 4K video as well as H.265/HEVC. The design has two 4-KB frame memories that hold the active and reference frames; they are built with a standard-cell memory technique and support line-based pixel writes and block-based pixel accesses. It computes a 16-pixel sum of absolute differences (SAD) per cycle over a 4 × 4 block, and is pipelined to take advantage of the high-throughput block pixel memories. The architecture supports configurable search patterns and threshold-based early termination, which allow run-time tradeoffs between pixel throughput and final quality of result. The accelerator (CMEACC) is independently clocked and can operate at up to 812 MHz at 1.3 V in 65 nm CMOS, achieving a throughput of 105 MPixel/sec for a single instance while consuming 0.933 pJ × sec/Pixel and occupying approximately 1.04 mm² after place-and-route. While operating at 0.9 V, the presented design consumes 0.393 nJ/Pixel, which scales to 8.06 mW at 22.26 FPS in 720p.
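
The abstract names three ideas that a short software sketch may help make concrete: a SAD over a 4 × 4 block, a configurable search pattern with threshold-based early termination, and the scaling of energy per pixel to average power. The Python below is only an illustrative model under stated assumptions, not the accelerator's hardware design. The function names (sad_4x4, search, average_power_mw), the list-of-rows frame layout, the example search pattern, and the 1280 × 720 resolution taken for 720p are all assumptions that do not come from the paper.

```python
# Illustrative software model only. All names and the frame layout
# (a list of rows of integer pixel values) are assumptions.

def sad_4x4(cur, ref, cx, cy, rx, ry):
    """Sum of absolute differences over one 4x4 block (16 pixel pairs)."""
    total = 0
    for dy in range(4):
        for dx in range(4):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def search(cur, ref, cx, cy, pattern, threshold):
    """Evaluate candidate motion vectors in the order given by `pattern`.

    Stops as soon as the best SAD so far is at or below `threshold`
    (threshold-based early termination). Returns (best_mv, best_sad).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = None, None
    for mvx, mvy in pattern:
        rx, ry = cx + mvx, cy + mvy
        # Skip candidates whose 4x4 block falls outside the reference frame.
        if rx < 0 or ry < 0 or rx + 4 > width or ry + 4 > height:
            continue
        cost = sad_4x4(cur, ref, cx, cy, rx, ry)
        if best_sad is None or cost < best_sad:
            best_mv, best_sad = (mvx, mvy), cost
        if best_sad <= threshold:
            break
    return best_mv, best_sad


def average_power_mw(energy_nj_per_pixel, width, height, fps):
    """Average power in mW: energy per pixel times pixels per second."""
    pixels_per_second = width * height * fps
    return energy_nj_per_pixel * 1e-9 * pixels_per_second * 1e3


# Synthetic 16x16 reference frame; the current frame is the reference
# shifted so that cur[y][x] == ref[y + 2][x + 1] away from the borders.
ref = [[(3 * x + 5 * y) % 251 for x in range(16)] for y in range(16)]
cur = [[ref[min(y + 2, 15)][min(x + 1, 15)] for x in range(16)]
       for y in range(16)]

# Hypothetical search pattern: a list of (mvx, mvy) offsets, tried in order.
pattern = [(0, 0), (1, 0), (0, 1), (1, 2), (-1, 0)]

exact_mv, exact_sad = search(cur, ref, 4, 4, pattern, threshold=0)
early_mv, early_sad = search(cur, ref, 4, 4, pattern, threshold=130)
power_720p = average_power_mw(0.393, 1280, 720, 22.26)
```

With threshold=0 the sketch keeps searching until it reaches the exact match at offset (1, 2). With threshold=130 it stops at the earlier candidate (0, 1), whose SAD of 128 is already below the threshold, so it does less work and accepts a worse match. This is the throughput-versus-quality tradeoff the abstract describes. The last line reproduces the abstract's power figure: 0.393 nJ/Pixel × 1280 × 720 pixels × 22.26 frames/sec is about 8.06 mW.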