Title: Source adaptive software 2D iDCT with SIMD
Author: L. Winger
Published in: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)
Publication date: 2000-06-05
DOI: 10.1109/ICASSP.2000.860191 (https://doi.org/10.1109/ICASSP.2000.860191)
Citations: 4
Abstract
This paper presents a fast two-dimensional inverse discrete cosine transform (iDCT) that adapts to compressed video source statistics to reduce execution time. iDCT algorithms for sparse blocks eliminate calculations for some zero coefficients and are implemented with quad-word parallel single-instruction-multiple-data (SIMD) multimedia instructions. It is observed that end-of-block marker value histograms vary little within single shots. An adaptive control mechanism is proposed that chooses the optimal set of iDCTs to prepare for an entire shot from its first frames (to reduce software overheads and penalties). This introduces no degradation of decoded video quality compared with a conventional SIMD 8×8 iDCT implemented with Intel MMX instructions. It is confirmed that execution time is reduced by an additional 15% with Murata's method for 4 Mbps MPEG-2 natural video. In comparison, execution time is reduced by 22% with a modified version of Murata's method, and by 35% with the new source adaptive method.
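The general idea of the abstract can be illustrated with a minimal scalar sketch in Python. It is not the paper's implementation: the paper uses MMX SIMD code, and the abstract does not give the exact pruning or selection rules. Two things here are assumptions made for illustration only: that a block whose last nonzero coefficient sits at a given zigzag index can be handled by a transform reading just the top-left k×k corner, and that the set of variants to prepare is simply the most frequently needed sizes seen in the first frames. All function names (`required_size`, `choose_sizes`, `dispatch`) are hypothetical.

```python
import math
from collections import Counter

N = 8
# Orthonormal DCT-II basis: C[u][x] = a(u) * cos((2x + 1) * u * pi / 16)
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def zigzag_order(n=N):
    """(row, col) pairs in the conventional zigzag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Odd anti-diagonals are walked with the row increasing, even ones with
    # the column increasing.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


ZIGZAG = zigzag_order()


def last_nonzero_index(block):
    """Zigzag index of the last nonzero coefficient (-1 for an all-zero block)."""
    last = -1
    for i, (r, c) in enumerate(ZIGZAG):
        if block[r][c] != 0:
            last = i
    return last


def required_size(last_index):
    """Smallest k whose top-left k x k corner holds zigzag positions 0..last_index.

    Assumption for this sketch: the paper's sparse variants may be cut differently.
    """
    if last_index < 0:
        return 1
    return 1 + max(max(r, c) for r, c in ZIGZAG[:last_index + 1])


def idct2(block, k=N):
    """Separable 2D iDCT that reads only the top-left k x k coefficients."""
    tmp = [[sum(block[u][v] * C[v][y] for v in range(k)) for y in range(N)]
           for u in range(k)]
    return [[sum(C[u][x] * tmp[u][y] for u in range(k)) for y in range(N)]
            for x in range(N)]


def choose_sizes(last_indices, max_variants=3):
    """Pick the most frequently needed sizes; always keep the full 8 x 8 transform.

    Hypothetical selection heuristic, not the paper's control mechanism.
    """
    counts = Counter(required_size(i) for i in last_indices)
    chosen = {k for k, _ in counts.most_common(max_variants)}
    return sorted(chosen | {N})


def dispatch(block, prepared):
    """Run the smallest prepared variant that covers the block's nonzero region."""
    need = required_size(last_nonzero_index(block))
    k = next(s for s in prepared if s >= need)
    return idct2(block, k), k
```

In this sketch, `choose_sizes` would be called once per shot with the last-nonzero indices gathered from its first frames, and `dispatch` would then be called per block. Because the skipped coefficients are exactly zero, the pruned transform returns the same samples as `idct2(block)` up to floating-point rounding, which mirrors the abstract's claim of no quality degradation.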