Deterministic Annealing Based Transform Domain Temporal Predictor Design for Adaptive Video Coding
B. Vishwanath, Tejaswi Nanjundaswamy, K. Rose
2019 Data Compression Conference (DCC), published 2019-03-01
DOI: 10.1109/DCC.2019.00027 (https://doi.org/10.1109/DCC.2019.00027)
Citations: 2
Abstract
Current video coders employ motion-compensated pixel-to-pixel prediction, which largely ignores significant spatial correlations and the fact that true temporal correlations vary with spatial frequency. Earlier work from our lab proposed to first spatially decorrelate the block of pixels and then perform temporal prediction in the transform domain, so as to account effectively for both spatial and temporal correlations. To adapt to variations in video signal statistics, the encoder switches between a set of appropriately designed prediction modes. This setting critically depends on efficient offline learning of the transform-domain temporal prediction modes. Significant challenges include: i) the instability and mismatched statistics inherent to closed-loop design; and ii) severe non-convexity of the cost function, which traps the system in poor local minima. Statistics mismatch is tackled by a design paradigm that operates in a stable open-loop fashion but asymptotically mimics closed-loop operation. The non-convexity is handled by deterministic annealing, a powerful non-convex optimization tool whose probabilistic formulation allows direct optimization of the cost function with respect to the discrete set of prediction modes, and whose annealing schedule avoids poor local minima. Experimental results validate the method's efficacy.
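To make the deterministic annealing idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that each prediction mode is a vector of per-frequency scaling factors applied to the transform coefficients of the motion-compensated reference block, and that the cost is plain squared prediction error. The open-loop design paradigm that mimics closed-loop operation is not modeled here. The names `design_modes`, `block_error`, and `hard_error`, and all parameter values, are hypothetical.

```python
import math
import random


def block_error(mode, ref, cur):
    """Squared error when each coefficient is predicted as mode[i] * ref[i]."""
    return sum((c - a * r) ** 2 for a, r, c in zip(mode, ref, cur))


def hard_error(modes, pairs):
    """Mean error when every block uses its best mode (the encoder's switch)."""
    return sum(min(block_error(m, r, c) for m in modes) for r, c in pairs) / len(pairs)


def design_modes(pairs, num_modes, t_start=1.0, t_end=1e-3, cooling=0.7,
                 iters=10, seed=0):
    """Design num_modes predictors from (reference, current) coefficient pairs.

    Assumed model (illustrative only): one scaling factor per frequency.
    """
    rng = random.Random(seed)
    k = len(pairs[0][0])
    # High-temperature limit: all modes coincide at the least-squares solution.
    base = []
    for i in range(k):
        num = sum(r[i] * c[i] for r, c in pairs)
        den = sum(r[i] * r[i] for r, c in pairs)
        base.append(num / den if den > 0 else 0.0)
    modes = [list(base) for _ in range(num_modes)]
    t = t_start
    while t >= t_end:
        # Small perturbation so coincident modes can split as t decreases.
        for m in modes:
            for i in range(k):
                m[i] += rng.gauss(0.0, 1e-3)
        for _ in range(iters):
            num = [[0.0] * k for _ in modes]
            den = [[0.0] * k for _ in modes]
            for ref, cur in pairs:
                errs = [block_error(m, ref, cur) for m in modes]
                low = min(errs)
                # Gibbs association probabilities at temperature t.
                w = [math.exp(-(e - low) / t) for e in errs]
                total = sum(w)
                for j in range(num_modes):
                    p = w[j] / total
                    for i in range(k):
                        num[j][i] += p * ref[i] * cur[i]
                        den[j][i] += p * ref[i] * ref[i]
            # Probability-weighted least-squares update of every mode.
            for j in range(num_modes):
                for i in range(k):
                    if den[j][i] > 0:
                        modes[j][i] = num[j][i] / den[j][i]
        t *= cooling  # annealing schedule: gradually harden the associations
    return modes
```

Calling `design_modes(pairs, 2)` on a list of `(ref, cur)` coefficient lists returns two per-frequency predictors, and `hard_error` then measures the error when each block picks its best mode. At high temperature every block is associated almost equally with all modes, so they stay near the single least-squares solution. As the temperature drops the associations harden and the modes separate, which is the mechanism by which annealing is meant to avoid poor local minima.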