Jordi Sanchez-Riera, Yuan-Sheng Hsiao, Tekoing Lim, K. Hua, Wen-Huang Cheng
{"title":"A robust tracking algorithm for 3D hand gesture with rapid hand motion through deep learning","authors":"Jordi Sanchez-Riera, Yuan-Sheng Hsiao, Tekoing Lim, K. Hua, Wen-Huang Cheng","doi":"10.1109/ICMEW.2014.6890556","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890556","url":null,"abstract":"There are two main problems that make hand gesture tracking especially difficult. One is the great number of degrees of freedom of the hand and the other one is the rapid movements that we make in natural gestures. Algorithms based on minimizing an objective function, with a good initialization, typically obtain good accuracy at low frame rates. However, these methods are very dependent on the initialization point, and fast movements on the hand position or gesture, provokes a lost of track which are unable to recover. We present a method that uses deep learning to train a set of gestures (81 gestures), that will be used as a rough estimate of the hand pose and orientation. This will serve to a registration of non rigid model algorithm that will find the parameters of hand, even when temporal assumption of smooth movements of hands is violated. To evaluate our proposed algorithm, different experiments are performed with some real sequences recorded with Intel depth sensor to demonstrate the performance in a real scenario.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132477401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Kristian Laundav, Camilla Birgitte Falk Jensen, Per Baekgaard, Michael Kai Petersen, J. E. Larsen
{"title":"Your heart might give away your emotions","authors":"David Kristian Laundav, Camilla Birgitte Falk Jensen, Per Baekgaard, Michael Kai Petersen, J. E. Larsen","doi":"10.1109/ICMEW.2014.6890662","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890662","url":null,"abstract":"Estimating emotional responses to pictures based on heart rate measurements: variations in Heart Rate serves as an important clinical health indicator, but potentially also as a window into cognitive reactions to presented stimuli, as a function of both stimuli, context and previous cognitive state. This study looks at single-trial time domain mean Heart Rate (HR) and frequency domain Heart Rate Variability (HRV) measured while subjects were passively viewing emotionally charged images, comparing short random presentations with grouped sequences of either neutral, highly arousing pleasant or highly arousing unpleasant pictures. Based on only a few users we were not able to demonstrate HRV variations that correlated with randomly presented emotional content due to the inherent noise in the signal. Nor could we reproduce results from earlier studies, which based on averaged values over many subjects, revealed small changes in the mean HR only seconds after presentation of emotional images. However for longer sequences of pleasant and unpleasant images, we found a trend in the mean HR that could correlate with the emotional content of the images. 
Suggesting a potential for using HR in single user Quantified Self applications to assess fluctuations over longer periods in emotional state, rather than dynamic responses to emotional stimuli.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"349 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116499578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
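The time-domain features named in this abstract can be computed directly from a series of RR intervals (the times between successive heartbeats). The sketch below shows mean HR and RMSSD, a common time-domain HRV statistic; the study's exact features and windowing are not given in the abstract, so this is a generic illustration, not the authors' analysis pipeline.

```python
import math

def mean_hr(rr_intervals_s):
    """Time-domain mean heart rate in beats per minute,
    from RR intervals given in seconds."""
    return 60.0 / (sum(rr_intervals_s) / len(rr_intervals_s))

def rmssd(rr_intervals_s):
    """RMSSD, a standard time-domain HRV statistic: root mean square of
    successive RR-interval differences, reported here in milliseconds."""
    diffs = [(b - a) * 1000.0 for a, b in zip(rr_intervals_s, rr_intervals_s[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

rr = [0.80, 0.82, 0.78, 0.85, 0.79]  # example RR series, roughly 74 bpm
bpm = mean_hr(rr)
variability_ms = rmssd(rr)
```

Frequency-domain HRV (the LF/HF bands the study also used) would additionally require resampling the RR series and a spectral estimate, which is omitted here.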
{"title":"Multimodel emotion analysis in response to multimedia","authors":"Wei-Long Zheng, Jia-Yi Zhu, Bao-Liang Lu","doi":"10.1109/ICMEW.2014.6890622","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890622","url":null,"abstract":"In this demo paper, we designed a novel framework combining EEG and eye tracking signals to analyze users' emotional activities in response to multimedia. To realize the proposed framework, we extracted efficient features of EEG and eye tracking signals and used support vector machine as classifier. We combined multimodel features using feature-level fusion and decision-level fusion to classify three emotional categories (positive, neutral and negative), which can achieve the average accuracies of 75.62% and 74.92%, respectively. We investigated the brain activities that are associated with emotions. Our experimental results indicated there exist stable common patterns and activated areas of the brain associated with positive and negative emotions. In the demo, we also showed the trajectory of emotion changes in response to multimedia.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132280097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaowei Song, Zixiang Xiong, Lei Yang, Zhoufeng Liu
{"title":"Depth-based human body enhancement in the infrared video","authors":"Xiaowei Song, Zixiang Xiong, Lei Yang, Zhoufeng Liu","doi":"10.1109/ICMEW.2014.6890656","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890656","url":null,"abstract":"Based on the depth information acquired by the popular RGBD camera such as Kinect, the human body image areas in the infrared video can be selectively enhanced. In this paper, we firstly utilized the Optimal Contrast-Tone Mapping (OCTM) method instead of Histogram Equalization (HE) method to make a good contrast balance for the infrared video image acquired in a low illumination condition. Secondly, we used multiple iterations of the Level Set algorithm to improve the human body silhouette which initially recognized by the RGBD camera in each infrared frame. Finally, in order to improve the image quality of the human body area in each infrared frame, a fast bilateral filter had been employed to eliminate the spot noise while maintaining good edge features. Experimental results show that the proposed method can effectively enhance the human subjects in the infrared video images.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133443759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new approach for extracting and summarizing abnormal activities in surveillance videos","authors":"Yihao Zhang, Weiyao Lin, Guangwei Zhang, Chuanfei Luo, Dong Jiang, Chunlian Yao","doi":"10.1109/ICMEW.2014.6890537","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890537","url":null,"abstract":"In this paper, we propose a new approach to detect abnormal activities in surveillance videos and create suitable summary videos accordingly. The proposed approach first introduces a blob sequence optimization process which integrates spatial, temporal, size, and motion correlation among objects to extract suitable abnormal blob sequences. With this process, blob extraction errors due to occlusion or background interferences can be effectively avoided. Then, we also propose an abnormality-type-based method which creates short-period summary videos for long-period input surveillance videos by properly arranging abnormal blob sequences according to their activity types. Experimental results show that our proposed approach can effectively create satisfying summary videos from input surveillance videos.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116532402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shenming Qu, R. Hu, Shihong Chen, Liang Chen, Maosheng Zhang
{"title":"Robust face super-resolution via position-patch neighborhood preserving","authors":"Shenming Qu, R. Hu, Shihong Chen, Liang Chen, Maosheng Zhang","doi":"10.1109/ICMEW.2014.6890650","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890650","url":null,"abstract":"By incorporating the priors that human face is a class of highly structured object, position-patch based face hallucination methods represent the test image patch through the same position patches of training faces by employing least square estimation or sparse coding. Due to they cannot provide unbiased approximations or ignore the influence of spatial distances between the test image patch and training basis image patches, the obtained representation is not satisfactory. In this paper, we propose a simpler yet more effective scheme called Position-patch Neighborhood Preserving (PNP). We improve existing SR methods by exploiting locality constraint and shrinkage measures to maintain locality and stability simultaneously. Moreover, our method use less similar patches, face hallucination is fast and robust. Various experimental results on standard face database show that our proposed method outperforms state-of-the-art methods in terms of both objective metrics and visual quality.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123552129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Majdi Rawashdeh, Mohammed F. Alhamid, Heung-Nam Kim, Awny Alnusair, Vanessa Maclsaac, Abdulmotaleb El Saddik
{"title":"Graph-based personalized recommendation in social tagging systems","authors":"Majdi Rawashdeh, Mohammed F. Alhamid, Heung-Nam Kim, Awny Alnusair, Vanessa Maclsaac, Abdulmotaleb El Saddik","doi":"10.1109/ICMEW.2014.6890593","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890593","url":null,"abstract":"In recent years, users of ambient intelligence environments have been overwhelmed by the huge numbers of social media available. Consequentially, users have trouble finding social media suited to their needs. To help users in ambient environment get relevant media tailored to their interests, we propose a new method which adapts the Katz measure, a path-ensemble based proximity measure, for the use in social tagging services. We model the ternary relations among user, resource and tag as a weighted, undirected tripartite graph. We then apply the Katz measure to this graph, and exploit it to provide personalized recommendation for individual users within ambient intelligence environments. The experimental evaluations show that the proposed method improves the recommendation performance compared to existing algorithms.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124727853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized video summarization by highest quality frames","authors":"K. Darabi, G. Ghinea","doi":"10.1109/ICMEW.2014.6890674","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890674","url":null,"abstract":"In this work, a user-centered approach has been the basis for generation of the personalized video summaries. Primarily, the video experts score and annotate the video frames during the enrichment phase. Afterwards, the frames scores for different video segments will be updated based on the captured end-users (different with video experts) priorities towards existing video scenes. Eventually, based on the pre-defined skimming time, the highest scored video frames will be extracted to be included into the personalized video summaries. In order to evaluate the effectiveness of our proposed model, we have compared the video summaries generated by our system against the results from 4 other summarization tools using different modalities.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125452177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yan Liang, Le Dong, Shanshan Xie, Na Lv, Zongyi Xu
{"title":"Compact feature based clustering for large-scale image retrieval","authors":"Yan Liang, Le Dong, Shanshan Xie, Na Lv, Zongyi Xu","doi":"10.1109/ICMEW.2014.6890597","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890597","url":null,"abstract":"This paper addresses the problem of fast similar image retrieval, especially for large-scale datasets with millions of images. We present a new framework which consists of two dependent algorithms. First, a new feature is proposed to represent images, which is dubbed compact feature based clustering (CFC). For each image, we first extract cluster centers of local features, and then calculate distribution histograms of local features and statistics of spatial information in each cluster to form compact features based clustering, replacing thousands of local features. It can reduce feature vectors of image representation and enhance the discriminative power of each feature. In addition, an efficient retrieval method is proposed, based on vocabulary tree through compact features based clustering. Extensive experiments on the Ukbench, Holidays, and ImageNet databases demonstrate that our method reduces the memory and computation overhead and improves the retrieval efficiency, while keeping approximate state-of-the-art accuracy.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123979419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaolong Jiang, Qin Yu, Kai Zhang, Siwei Ma, H. Qi, S. Lei
{"title":"A flexible reference picture management scheme","authors":"Xiaolong Jiang, Qin Yu, Kai Zhang, Siwei Ma, H. Qi, S. Lei","doi":"10.1109/ICMEW.2014.6890694","DOIUrl":"https://doi.org/10.1109/ICMEW.2014.6890694","url":null,"abstract":"Reference picture management is an important part in video encoding and decoding process, and has great influence on the coding performance. In this paper, a flexible reference picture management scheme is proposed for the second generation of Audio Video Coding Standard (AVS2). In the proposed scheme, reference configuration set (RCS), which consists of reference picture information, is used to manage the reference picture. Based on RCS, the reference picture set for current coding picture can be arbitrarily configured. Experimental results show that the proposed flexible reference picture management scheme achieves significant bitrate reduction in AVS2 encoder. For low delay P and random access common test condition, the average coding gain can be up to 4.3% and 5.1% respectively.","PeriodicalId":178700,"journal":{"name":"2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127266914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}