{"title":"2-D to 3-D conversion of videos using fixed point learning approach","authors":"Nidhi Chahal, S. Chaudhury","doi":"10.1109/ICECE.2016.7853844","journal":{"name":"2016 9th International Conference on Electrical and Computer Engineering (ICECE)"},"publicationDate":"2016-12-01","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICECE.2016.7853844"}
2-D to 3-D conversion of videos using fixed point learning approach
Depth cues from multiple images enable accurate depth extraction, while monocular cues from a single still image are more versatile. In this paper, a monocular cue, which gives useful information about a single frame, and depth from motion, computed using optical flow estimated from consecutive video frames, are combined to produce the final depth maps. Machine learning is a promising new research direction in depth estimation and hence in 2-D to 3-D conversion. We propose a fast automatic technique that uses a fixed point learning framework to accurately estimate the depth maps of test images. For this task, a contextual prediction function is generated from a training database of 2-D color images and ground truth depth images. The depth maps obtained from the monocular and motion depth cues of the input video frames serve as input features for the learning process. The depths generated by the fixed point model are more accurate and reliable than those obtained by Markov random field (MRF) fusion of the same depth cues. Stereo pairs are generated from the depth maps predicted by fixed point learning and converted to a 3-D output video, which is displayed on a 3-D TV. For subjective evaluation, a mean opinion score (MOS) is calculated by showing the final 3-D video to different viewers wearing 3-D glasses.
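To make the fixed point idea concrete, here is a minimal Python sketch of the inference step only. It is not the authors' implementation. It assumes a simple linear contextual prediction function whose inputs are the monocular depth cue, the motion depth cue, and the mean of the currently predicted depths of the four neighboring pixels. The function names, the linear form, and the weights are hypothetical placeholders. In the paper the prediction function comes from a training database, a step omitted here.

```python
import numpy as np


def neighbor_mean(depth):
    """Mean of the 4-connected neighbors of every pixel (edges replicated)."""
    p = np.pad(depth, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0


def fixed_point_depth(mono, motion, weights, max_iter=200, tol=1e-6):
    """Iterate d <- f(mono, motion, neighbors(d)) until d stops changing.

    f is a hypothetical linear contextual prediction function:
        d = b + w_mono * mono + w_motion * motion + w_ctx * neighbor_mean(d)
    With |w_ctx| < 1 the update is a contraction, so the iteration converges.
    """
    b, w_mono, w_motion, w_ctx = weights
    if abs(w_ctx) >= 1.0:
        raise ValueError("w_ctx must satisfy |w_ctx| < 1 for convergence")
    # Part of the prediction that does not depend on the current depth estimate.
    unary = b + w_mono * mono + w_motion * motion
    depth = unary.copy()  # initial guess
    for _ in range(max_iter):
        new_depth = unary + w_ctx * neighbor_mean(depth)
        if np.max(np.abs(new_depth - depth)) < tol:
            return new_depth
        depth = new_depth
    return depth
```

Calling `fixed_point_depth(mono, motion, (0.0, 0.25, 0.25, 0.5))` on two cue maps of equal shape returns a depth map in which every pixel agrees with the prediction made from its own cues and its neighbors' depths. That agreement is the fixed point the framework's name refers to.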