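Stereoscopic Video Quality Assessment with Multi-level Binocular Fusion Network Considering Disparity and Multi-scale Information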

Yingjie Feng, Sumei Li
Published in: 2021 International Conference on Visual Communications and Image Processing (VCIP)
Publication date: 2021-12-05
DOI: 10.1109/VCIP53242.2021.9675404 (https://doi.org/10.1109/VCIP53242.2021.9675404)
Citations: 3

Abstract

Stereoscopic video quality assessment (SVQA) is important for the development of the stereoscopic video industry. In this paper, we propose a three-branch multi-level binocular fusion convolutional neural network (MBFNet) that is highly consistent with human visual perception. The network includes three main innovations. First, we construct a multi-scale cross-dimension attention module (MSCAM) on the left and right branches to capture more critical semantic information. Second, we design a multi-level binocular fusion unit (MBFU) to adaptively fuse the features from the left and right branches. Third, we add a disparity compensation branch (DCB) containing an enhancement unit (EU) to provide disparity features. Experimental results show that the proposed method outperforms existing SVQA methods and achieves state-of-the-art performance.
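The abstract does not specify how the MBFU weights the two views, so the following is only a minimal sketch of what "adaptive" fusion of left and right features can mean in general. The function names (`global_avg`, `adaptive_fuse`) and the weighting rule (a softmax over per-channel mean activations) are assumptions made for illustration; they are not the authors' MBFU, which would learn its weights inside the network.

```python
import math


def global_avg(channel):
    """Mean activation of one channel, given as a flat list of floats."""
    return sum(channel) / len(channel)


def adaptive_fuse(left, right):
    """Fuse two feature maps channel by channel (hypothetical rule).

    left, right: lists of channels; each channel is a flat list of floats.
    The per-channel weights are a softmax over the two mean activations, so
    the view with the stronger response contributes more. This rule is an
    illustrative assumption, not the fusion unit described in the paper.
    """
    fused = []
    for l_ch, r_ch in zip(left, right):
        l_score, r_score = global_avg(l_ch), global_avg(r_ch)
        m = max(l_score, r_score)  # subtract the max for numerical stability
        e_l, e_r = math.exp(l_score - m), math.exp(r_score - m)
        w_l = e_l / (e_l + e_r)
        w_r = 1.0 - w_l
        fused.append([w_l * a + w_r * b for a, b in zip(l_ch, r_ch)])
    return fused


# Two channels with two spatial positions each (toy values).
left = [[1.0, 1.0], [0.0, 2.0]]
right = [[1.0, 1.0], [4.0, 2.0]]
fused = adaptive_fuse(left, right)
```

In the first channel both views respond equally, so the weights are equal and the fused values match the inputs. In the second channel the right view has the larger mean activation, so the fused values lean toward the right view.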