{"title":"利用三维卷积神经网络预测立体视频中的晕动病。","authors":"Tae Min Lee, Jong-Chul Yoon, In-Kwon Lee","doi":"10.1109/TVCG.2019.2899186","DOIUrl":null,"url":null,"abstract":"<p><p>In this paper, we propose a three-dimensional (3D) convolutional neural network (CNN)-based method for predicting the degree of motion sickness induced by a 360° stereoscopic video. We consider the user's eye movement as a new feature, in addition to the motion velocity and depth features of a video used in previous work. For this purpose, we use saliency, optical flow, and disparity maps of an input video, which represent eye movement, velocity, and depth, respectively, as the input of the 3D CNN. To train our machine-learning model, we extend the dataset established in the previous work using two data augmentation techniques: frame shifting and pixel shifting. Consequently, our model can predict the degree of motion sickness more precisely than the previous method, and the results have a more similar correlation to the distribution of ground-truth sickness.</p>","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":"25 5","pages":"1919-1927"},"PeriodicalIF":4.7000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TVCG.2019.2899186","citationCount":"43","resultStr":"{\"title\":\"Motion Sickness Prediction in Stereoscopic Videos using 3D Convolutional Neural Networks.\",\"authors\":\"Tae Min Lee, Jong-Chul Yoon, In-Kwon Lee\",\"doi\":\"10.1109/TVCG.2019.2899186\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In this paper, we propose a three-dimensional (3D) convolutional neural network (CNN)-based method for predicting the degree of motion sickness induced by a 360° stereoscopic video. We consider the user's eye movement as a new feature, in addition to the motion velocity and depth features of a video used in previous work. For this purpose, we use saliency, optical flow, and disparity maps of an input video, which represent eye movement, velocity, and depth, respectively, as the input of the 3D CNN. To train our machine-learning model, we extend the dataset established in the previous work using two data augmentation techniques: frame shifting and pixel shifting. Consequently, our model can predict the degree of motion sickness more precisely than the previous method, and the results have a more similar correlation to the distribution of ground-truth sickness.</p>\",\"PeriodicalId\":13376,\"journal\":{\"name\":\"IEEE Transactions on Visualization and Computer Graphics\",\"volume\":\"25 5\",\"pages\":\"1919-1927\"},\"PeriodicalIF\":4.7000,\"publicationDate\":\"2019-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/TVCG.2019.2899186\",\"citationCount\":\"43\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Visualization and Computer Graphics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/TVCG.2019.2899186\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2019/2/15 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Visualization and Computer Graphics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TVCG.2019.2899186","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2019/2/15 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Motion Sickness Prediction in Stereoscopic Videos using 3D Convolutional Neural Networks.
In this paper, we propose a three-dimensional (3D) convolutional neural network (CNN)-based method for predicting the degree of motion sickness induced by a 360° stereoscopic video. We consider the user's eye movement as a new feature, in addition to the motion velocity and depth features of a video used in previous work. For this purpose, we use saliency, optical flow, and disparity maps of an input video, which represent eye movement, velocity, and depth, respectively, as the input of the 3D CNN. To train our machine-learning model, we extend the dataset established in the previous work using two data augmentation techniques: frame shifting and pixel shifting. Consequently, our model can predict the degree of motion sickness more precisely than the previous method, and the results have a more similar correlation to the distribution of ground-truth sickness.
期刊介绍:
TVCG is a scholarly, archival journal published monthly. Its Editorial Board strives to publish papers that present important research results and state-of-the-art seminal papers in computer graphics, visualization, and virtual reality. Specific topics include, but are not limited to: rendering technologies; geometric modeling and processing; shape analysis; graphics hardware; animation and simulation; perception, interaction and user interfaces; haptics; computational photography; high-dynamic range imaging and display; user studies and evaluation; biomedical visualization; volume visualization and graphics; visual analytics for machine learning; topology-based visualization; visual programming and software visualization; visualization in data science; virtual reality, augmented reality and mixed reality; advanced display technology, (e.g., 3D, immersive and multi-modal displays); applications of computer graphics and visualization.