Quality Enhancement of Screen Content Video using Dual-input CNN

Ziyin Huang, Yue Cao, Sik-Ho Tsang, Yui-Lam Chan, K. Lam

2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 7 November 2022
DOI: 10.23919/APSIPAASC55919.2022.9979969

Abstract
In recent years, video quality enhancement techniques have made significant breakthroughs, moving from traditional methods, such as the deblocking filter (DF) and sample adaptive offset (SAO), to deep learning-based approaches. While screen content coding (SCC) has become an important extension of High Efficiency Video Coding (HEVC), existing approaches mainly focus on improving the quality of natural sequences in HEVC rather than the screen content (SC) sequences in SCC. Therefore, we propose a dual-input model for quality enhancement in SCC. One branch, the main branch, takes the decoded image as input; the other, the mask branch, takes side information extracted from the coded bitstream. Specifically, the mask branch is designed to use the coding unit (CU) information and the mode information as input, assisting the convolutional network in the main branch to further improve video quality and thereby coding efficiency. Moreover, because only a limited number of SC videos are available, a new SCC dataset, namely PolyUSCC, is established. With the proposed dual-input technique, adding our mask branch to two state-of-the-art models, DnCNN and DCAD, further reduces BD-rates by 3.81% and 3.07%, respectively, compared with conventional SCC.
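To illustrate the dual-input idea described in the abstract, below is a minimal PyTorch sketch of a quality-enhancement network with a main branch for the decoded frame and a mask branch for side information (e.g. a CU-partition map and a mode map). The layer counts, channel widths, fusion scheme, and the class name DualInputQECNN are illustrative assumptions, not the architecture reported in the paper.

```python
# Minimal sketch of a dual-input quality-enhancement CNN (assumed design, not the paper's exact model).
import torch
import torch.nn as nn


class DualInputQECNN(nn.Module):
    def __init__(self, mask_channels: int = 2, features: int = 64):
        super().__init__()
        # Main branch: operates on the decoded (compressed) luma frame.
        self.main_branch = nn.Sequential(
            nn.Conv2d(1, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(features, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Mask branch: operates on side information decoded from the bitstream,
        # e.g. one channel for the CU-partition map and one for the coding-mode map.
        self.mask_branch = nn.Sequential(
            nn.Conv2d(mask_channels, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Fusion and reconstruction: predict a residual that is added back to the input.
        self.fusion = nn.Sequential(
            nn.Conv2d(2 * features, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(features, 1, kernel_size=3, padding=1),
        )

    def forward(self, frame: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        f = self.main_branch(frame)
        m = self.mask_branch(mask)
        residual = self.fusion(torch.cat([f, m], dim=1))
        return frame + residual  # enhanced frame


if __name__ == "__main__":
    # Example: enhance a 64x64 decoded patch given a 2-channel side-information map.
    frame = torch.rand(1, 1, 64, 64)   # decoded luma patch, values in [0, 1]
    mask = torch.rand(1, 2, 64, 64)    # CU-partition map + mode map (normalized)
    enhanced = DualInputQECNN()(frame, mask)
    print(enhanced.shape)              # torch.Size([1, 1, 64, 64])
```

In this sketch, the side-information maps are fused with the image features by channel concatenation; the same mask branch could instead be attached to existing restoration networks such as DnCNN or DCAD, which is the setting the paper's BD-rate comparisons refer to.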