Authors: Michael Braly, Aaron Stillmaker, B. Baas
Venue: DASC-PICom-DataCom-CyberSciTech 2017 : 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing ; 2017 IEEE 15th International Conference on Pervasive Intelligence and Computing ; 2017 IEEE 3rd International..., pp. 79-85
Published: 2017-08-01
DOI: https://doi.org/10.1109/DESEC.2017.8073837
A configurable H.265-compatible motion estimation accelerator architecture for realtime 4K video encoding in 65 nm CMOS
The design of a configurable motion estimation accelerator is presented and shown to be suitable for real-time digital 4K video encoding and compatible with H.265/HEVC. The design has two 4-KB frame memories that hold the active and reference frames. They are built with a standard-cell memory technique and support line-based pixel writes and block-based pixel reads. The datapath computes a 16-pixel sum of absolute differences (SAD) per cycle over a 4 × 4 block and is pipelined to take advantage of the high-throughput block pixel memories. The architecture supports configurable search patterns and threshold-based early termination, which allow run-time tradeoffs between pixel throughput and the final quality of the result. The accelerator, CMEACC, is independently clocked and can operate at up to 812 MHz at 1.3 V in 65 nm CMOS, achieving a throughput of 105 MPixel/sec for a single instance while consuming 0.933 pJ × sec/Pixel and occupying approximately 1.04 mm² after place-and-route. When operating at 0.9 V, the presented design consumes 0.393 nJ/Pixel, which scales to 8.06 mW at 22.26 FPS in 720p.
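The operations named in the abstract can be illustrated in software. The following Python sketch is not the accelerator's implementation: it only models a 16-pixel SAD over a 4 × 4 block, a search over a caller-supplied list of candidate offsets (standing in for a configurable search pattern), and threshold-based early termination. The names `sad_4x4`, `search`, and `average_power_mw` are hypothetical, as are the frame layout (lists of rows of integer samples) and the rule that the search stops once the best SAD is at or below the threshold. The last function just reproduces the abstract's power arithmetic, assuming 720p means 1280 × 720 pixels.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[List[int]]  # assumed layout: rows of integer pixel samples


def sad_4x4(cur: Frame, ref: Frame, cx: int, cy: int, rx: int, ry: int) -> int:
    """Sum of absolute differences over one 4x4 block (16 pixel pairs)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(4)
        for i in range(4)
    )


def search(
    cur: Frame,
    ref: Frame,
    cx: int,
    cy: int,
    pattern: Sequence[Tuple[int, int]],
    threshold: int,
) -> Tuple[Tuple[int, int], Optional[int]]:
    """Try candidate (dx, dy) offsets in order and keep the lowest SAD.

    Stops early once the best SAD is at or below `threshold` (an assumed
    termination rule). Returns ((0, 0), None) if no candidate fits the frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv: Tuple[int, int] = (0, 0)
    best_sad: Optional[int] = None
    for dx, dy in pattern:
        rx, ry = cx + dx, cy + dy
        # Skip candidates whose 4x4 block would fall outside the reference.
        if not (0 <= rx <= width - 4 and 0 <= ry <= height - 4):
            continue
        cost = sad_4x4(cur, ref, cx, cy, rx, ry)
        if best_sad is None or cost < best_sad:
            best_mv, best_sad = (dx, dy), cost
        if best_sad <= threshold:
            break  # good enough: trade quality for throughput
    return best_mv, best_sad


def average_power_mw(energy_nj_per_pixel: float, width: int, height: int, fps: float) -> float:
    """Average power in mW = energy per pixel (nJ) x pixels per second."""
    pixels_per_second = width * height * fps
    return energy_nj_per_pixel * 1e-9 * pixels_per_second * 1e3


if __name__ == "__main__":
    # Reference frame with a simple gradient; the current frame is the
    # reference displaced so that the true offset is (dx, dy) = (1, -1).
    ref = [[x + 16 * y for x in range(8)] for y in range(8)]
    cur = [
        [ref[y - 1][x + 1] if 0 <= y - 1 < 8 and 0 <= x + 1 < 8 else 0 for x in range(8)]
        for y in range(8)
    ]
    full = [(dx, dy) for dy in range(-2, 3) for dx in range(-2, 3)]
    print(search(cur, ref, 2, 2, full, threshold=0))
    print(round(average_power_mw(0.393, 1280, 720, 22.26), 2))
```

A higher threshold ends the search sooner and accepts a worse match, which is the throughput-versus-quality tradeoff the abstract describes; the order of the offsets in `pattern` decides which candidates are seen before termination.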