Collaborative inter-prediction on CPU+GPU systems
S. Momcilovic, A. Ilic, N. Roma, L. Sousa
2014 IEEE International Conference on Image Processing (ICIP), pp. 1228-1232, 2014-10-01
DOI: 10.1109/ICIP.2014.7025245 (https://doi.org/10.1109/ICIP.2014.7025245)

In this paper we propose an efficient method for collaborative H.264/AVC inter-prediction in heterogeneous CPU+GPU systems. To minimize the overall encoding time, the proposed method provides a stable and balanced load distribution of the most computationally demanding video encoding modules, relying on accurate, dynamically built functional performance models. Based on an extensive rate-distortion (RD) analysis, an efficient temporally dependent prediction of the search area center is proposed, which allows dependency-aware workload partitioning and efficient GPU parallelization while preserving high compression efficiency. The proposed method also introduces efficient communication-aware techniques, which maximize data reuse and decrease the overhead of expensive data transfers in collaborative video encoding. The experimental results show that the proposed method is capable of achieving real-time video encoding for very demanding video coding parameters, i.e. full HD video format, a 64×64 pixel search area, and exhaustive motion estimation.
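
The load-balancing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper relies on functional performance models, whereas this example substitutes a much simpler constant-speed model smoothed over frames. The function names (`split_rows`, `update_speed`), the smoothing factor `alpha`, the unit of work (macroblock rows), and the device speeds in the demo are all assumptions made for illustration.

```python
def split_rows(total_rows, cpu_speed, gpu_speed):
    """Split macroblock rows between CPU and GPU in proportion to their
    estimated speeds (rows per second). Returns (cpu_rows, gpu_rows)."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if cpu_speed <= 0 or gpu_speed <= 0:
        raise ValueError("speeds must be positive")
    gpu_rows = round(total_rows * gpu_speed / (cpu_speed + gpu_speed))
    cpu_rows = total_rows - gpu_rows
    return cpu_rows, gpu_rows


def update_speed(old_speed, rows_done, elapsed, alpha=0.5):
    """Blend the previous speed estimate with the speed measured on the
    last frame (exponential smoothing). Keeps the old estimate when the
    device did no work or no time was measured."""
    if rows_done == 0 or elapsed <= 0:
        return old_speed
    measured = rows_done / elapsed
    return (1 - alpha) * old_speed + alpha * measured


if __name__ == "__main__":
    rows = 68  # a 1080-line frame padded to 1088 lines holds 68 rows of 16x16 macroblocks
    cpu_speed = gpu_speed = 1.0          # initial guess: equal speeds
    true_cpu, true_gpu = 200.0, 600.0    # hypothetical device speeds, rows/s
    for frame in range(5):
        cpu_rows, gpu_rows = split_rows(rows, cpu_speed, gpu_speed)
        # Simulated per-device processing time for this frame.
        cpu_time = cpu_rows / true_cpu
        gpu_time = gpu_rows / true_gpu
        cpu_speed = update_speed(cpu_speed, cpu_rows, cpu_time)
        gpu_speed = update_speed(gpu_speed, gpu_rows, gpu_time)
        print(frame, cpu_rows, gpu_rows)
```

Under this simplification, the split drifts toward the ratio of the devices' measured speeds, so both finish their share of a frame at roughly the same time. A functional performance model would instead let each device's speed depend on the amount of work assigned to it.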