Visual Comfort Classification for Stereoscopic Videos Based on Two-Stream Recurrent Neural Network with Multi-level Attention

Weize Gan, Danhong Peng, Yuzhen Niu
Published in: Proceedings of the 5th International Conference on Control and Computer Vision, 2022-08-19
DOI: 10.1145/3561613.3561628 (https://doi.org/10.1145/3561613.3561628)

Abstract

Because the visual systems of children and adults differ, a professionally produced stereoscopic 3D video may not be comfortable for children to watch. In this paper, we address whether a stereoscopic video is comfortable for children by formulating the question as visual comfort classification for stereoscopic videos. Specifically, we propose a two-stream recurrent neural network (RNN) with multi-level attention. First, the two-stream RNN extracts and fuses spatial and temporal features from video frames and disparity maps. Second, multi-level attention enhances the features at the frame level, the shot level, and finally the video level. In addition, to the best of our knowledge, we establish the first high-definition stereoscopic 3D video dataset for performance evaluation. Experimental results show that the proposed model can effectively classify professional stereoscopic videos as visually comfortable for children or as suitable for adults only.
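
The multi-level attention idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural details, so the dot-product scoring, the fusion of the two streams by concatenation, and all names below (`attention_pool`, `fuse_streams`, `video_descriptor`, the query vectors) are assumptions made for illustration. The recurrent part of the model is omitted entirely; the sketch only shows how per-frame features could be pooled into shot features and then into a single video feature.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: Sequence[Vector], query: Vector) -> Vector:
    """Weighted average of feature vectors.

    Weights are the softmax of dot(feature, query). Dot-product scoring
    is an assumption; the paper may score features differently.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def fuse_streams(frame_feat: Vector, disparity_feat: Vector) -> Vector:
    """Fuse the two streams by concatenation (a hypothetical choice)."""
    return list(frame_feat) + list(disparity_feat)


def video_descriptor(
    shots: Sequence[Sequence[Tuple[Vector, Vector]]],
    frame_query: Vector,
    shot_query: Vector,
) -> Vector:
    """Pool frame features into shot features, then shots into one video feature.

    Each shot is a list of (frame_feature, disparity_feature) pairs.
    """
    shot_vectors = []
    for shot in shots:
        fused = [fuse_streams(frame, disp) for frame, disp in shot]
        shot_vectors.append(attention_pool(fused, frame_query))  # frame level
    return attention_pool(shot_vectors, shot_query)  # shot level


# Toy input: two shots with one-dimensional features per stream.
shots = [
    [([1.0], [3.0]), ([3.0], [5.0])],
    [([5.0], [7.0])],
]
video_feature = video_descriptor(shots, frame_query=[0.0, 0.0], shot_query=[0.0, 0.0])
```

With all-zero queries the attention weights are uniform, so each pooling step reduces to a plain average; learned queries would instead emphasize the frames and shots most relevant to visual comfort. A classifier over the resulting video-level feature would then produce the "comfortable for children" versus "adults only" decision.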