{"title":"High-resolution 3D reconstruction for complex color scenes with structured light","authors":"Weidong Hu, Mingying Gong, Yanhui Hong, Lifeng Sun, Shiqiang Yang","doi":"10.1109/ICMEW.2014.6890610","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890610","url":null,"abstract":"High-resolution 3D reconstruction has become increasingly important due to its wide applications such as 3D printing, entertainment, engineering etc. In this paper, a high-resolution 3D reconstruction system based on structured light is proposed. Our proposed system contains a consuming projector and a digital camera. Unlike the traditional structured light which directly projects patterns into scenes, we project extended patterns generated from a 2D mother pattern and analysis the luminance changes of pixels of captured images to restore the mother pattern. With the help of extended patterns, the proposed system can not only deal with complex scenes robustly, but also deal with color scenes. And since the positions are matched based on 2D perfect map, external parameters of the projector and digital camera are not required during the evaluation. Experimental results show that the proposed system can generates high-resolution surfaces.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122800482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System for multiple people identification and tracking in video","authors":"T. Zhang","doi":"10.1109/ICMEW.2014.6890614","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890614","url":null,"abstract":"In this paper, a software system is presented that supports finding, tagging, identifying and tracking multiple people in videos with uncontrolled capturing conditions. The work was focused on two aspects. One is to build a parallel video processing pipeline that integrates image analysis modules such as face detection, recognition and tracking, efficiently and smoothly, so that multiple people can be simultaneously tracked in real time. Another aspect is to make innovations to each of the major image processing modules so that they are both fast and robust to variations in pose, illumination, occlusions and so on. Written in C++, this demo runs on a mainstream laptop. It can instantly recognize and constantly track multiple subjects in live or recorded videos.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114598558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel user-centered design for personalized video summarization","authors":"G. Ghinea, R. Kannan, Sridhar Swaminathan, Suresh Kannaiyan","doi":"10.1109/ICMEW.2014.6890642","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890642","url":null,"abstract":"In the past, several automatic video summarization systems had been proposed to generate video summary. However, a generic video summary that is generated based only on audio, visual and textual saliencies will not satisfy every user. This paper proposes a novel system for generating semantically meaningful personalized video summaries, which are tailored to the individual user's preferences over video semantics. Each video shot is represented using a semantic multinomial which is a vector of posterior semantic concept probabilities. The proposed system stitches video summary based on summary time span and top-ranked shots that are semantically relevant to the user's preferences. The proposed summarization system is evaluated using both quantitative and subjective evaluation metrics. The experimental results on the performance of the proposed video summarization system are encouraging.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129553620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation on Huawei Accurate and Fast Mobile Video Annotation Challenge","authors":"Z. Chai, Dong Wang, Tian Wang, Jian-zhuo Liu, Xinzi Zhang, Yihong Gong","doi":"10.1109/ICMEW.2014.6890607","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890607","url":null,"abstract":"Massive user generated content (UGC) videos are produced each day on the Internet. These videos have become a very important integrant in existing social networking services (SNS). However, unlike professional films, the content of UGC videos is usually unstructured and lacks contextual annotation for management. The motivation behind Huawei Accurate and Fast Mobile Video Annotation Challenge (MoVAC) is to evaluate different algorithms on the generation of local annotation on UGC videos under the same protocol, and to compare them not only in accuracy but also in efficiency. More than 15 teams from different countries have enrolled in this competition, and in the final round 17 submissions with valid result from 6 teams were received. The results show that recent popular deep convolutional neural networks (CNN) could be a potentially good solution to this task.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129305351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gender estimation for SNS user profiling using automatic image annotation","authors":"Xiaojun Ma, Y. Tsuboshita, N. Kato","doi":"10.1109/ICMEW.2014.6890569","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890569","url":null,"abstract":"User profiling for Social Network Services (SNS) has gained great attention because of its potential values in identifying target population, which is very informative for marketing. Many studies have been conducted to estimate SNS user profiles using text analysis. However, in spite of the huge quantities of image resources on SNS, no previous work has specifically explored user profiles by automatic image annotation techniques. This paper addresses the problem of inferring a SNS user's gender by automatic image annotation. The proposed method involves learning a model to annotate SNS images and integrating annotation scores of images to infer a user's gender. Evaluation based on Twitter data demonstrates promising results.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124675938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video security with human identification and tracking","authors":"T. Zhang","doi":"10.1109/ICMEW.2014.6890591","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890591","url":null,"abstract":"With the pervasiveness of monitoring cameras installed in public places, schools, hospitals and homes, video analytics technologies for interpreting the generated video content are becoming more and more relevant to people's lives. Along this context, we develop a human-centric video surveillance system that identifies and tracks people in a given scene. In this paper, a parallel processing pipeline is proposed that integrates image processing modules in the system, such as face detection, person recognition and tracking, efficiently and smoothly, so that multiple people can be simultaneously tracked in real time. Furthermore, significant innovations are involved in this work in making each of the major image analysis modules both fast and robust to variations in pose, illumination, occlusions and so on. A demonstration software has been implemented that supports finding, tagging, identifying and tracking people in live or recorded videos with uncontrolled capturing conditions.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"214 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123235373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vehicle face recognition using weighted visual patches","authors":"Xiaoqiong Su, Chongyang Zhang, Lin Mei, Wenfei Wang, Jiadi Yang","doi":"10.1109/ICMEW.2014.6890584","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890584","url":null,"abstract":"Distinguishing similar objects is a challenging task; visual patches especially salient or discriminative patches are widely adopted in the state-of-the-art recognition methods to enhance the discovery performance. Considering the fact that different patches have different contributions to the recognition, we develop a fine-grained object recognition algorithm using location and distinction weighted visual patches book, which has two contributions: 1) Location weight is adopted to reduce the influence of non-discriminative patches in the indistinctive area; 2) Between-category differences (DBC) and within-category differences (DWC) are introduced to evaluate the distinction of different patches, which is used to enhance the recognition performance by emphasizing key patches. The paper experimentally demonstrates large improvements over the existing methods for fine-grained as well as position shifted vehicle face recognition.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114880221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating interactive search in videos with image and textual description defined target scenes","authors":"Claudiu Cobârzan","doi":"10.1109/ICMEW.2014.6890550","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890550","url":null,"abstract":"For non-expert users, basic video players with simple controls are often the tools of choice when searching for specific scenes within videos. Basic actions like play, fast-forward, fast-reverse, positioning using the seeker-bar are familiar and are often preferred over advanced retrieval tools when it comes to interactive search and browsing. In order to get a deeper insight into how users approach such tasks with a simple player, we analyzed the search behavior of two groups of 17 users in two different setups. In both setups the users had to locate an approximately 20 seconds long video sequence within an hour long news video. In the first setup, the users were presented with pictures uniformly sampled from within the target scene as well as a textual description. In the second setup, only the sampled pictures were provided for each target. We asses the impact, in terms of search behavior, of identifying target scenes through sampled images, respectively sampled images and textual descriptions.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122005208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tell me what","authors":"Xiansheng Hua, Jin Li","doi":"10.1109/ICMEW.2014.6890616","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890616","url":null,"abstract":"“Tell Me What” is smart phone based image recognition system, and it is also an automatic pipeline for generating image recognition systems to recognize an arbitrary set of entities. For any given set of entities, “Tell Me What” backend system automatically fetches related image data from the Internet for each entity, and then run a comprehensive data cleaning process to purify the data. A multi-class classifier and inverted index are then built based on the cleaned data. For an unknown new image captured by a camera, the user is allowed to optionally highlight regions and then a classification process and a search process are applied to get recognition results. Distributed computing techniques are applied to ensure that the backend model and index generation processes can be done in a few hours.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134058448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video navigation on tablets with multi-touch gestures","authors":"Klaus Schöffmann, Kevin Chromik, L. Böszörményi","doi":"10.1109/ICMEW.2014.6890560","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890560","url":null,"abstract":"We describe a new interaction method for video navigation on touch-enabled tablet devices, which is based on previous research results and uses context-sensitive swipe gestures. We evaluate our method in a user study with known-item-search tasks in direct comparison to seeker-bar navigation that is commonly used for navigation with video players on tablets and smartphones. Our evaluation results show that users prefer the swipe-based navigation feature over a seeker-bar in terms of convenience and that users can achieve better search performance with this new way of video navigation.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134269616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}