2017 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

Gait phase classification for in-home gait assessment
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-31 | DOI: 10.1109/ICME.2017.8019500
Authors: Minxiang Ye, Cheng Yang, V. Stanković, L. Stanković, Samuel Cheng
Abstract: With a growing ageing population, acquiring joint measurements with sufficient accuracy for reliable gait assessment is essential. Additionally, the quality of gait analysis relies heavily on accurate feature selection and classification. Sensor-driven and one-camera optical motion capture systems are becoming increasingly popular in the scientific literature due to their portability and cost-efficacy. In this paper, we propose 12 gait parameters to characterise gait patterns and a novel gait-phase classifier, resulting in classification performance comparable with a state-of-the-art multi-sensor optical motion system. Furthermore, a novel multi-channel time series segmentation method is proposed that maximizes the temporal information of the gait parameters, improving the final classification success rate after gait event reconstruction. The validation, conducted over 126 experiments on 6 healthy volunteers and 9 stroke patients with hand-labelled ground-truth gait phases, demonstrates high gait classification accuracy.
Citations: 10
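A minimal sketch of the general pipeline the abstract describes: segment a multi-channel time series into overlapping windows and classify each window into a gait phase. The window sizes, the summary statistics, and the random forest below are stand-ins, not the paper's 12 gait parameters or its classifier.

```python
# Hypothetical sketch: windowed multi-channel gait-phase classification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment_windows(signals, win=30, hop=10):
    """Slice a (channels, T) multi-channel series into overlapping windows
    and summarize each window with simple per-channel statistics."""
    feats = []
    for start in range(0, signals.shape[1] - win + 1, hop):
        w = signals[:, start:start + win]
        feats.append(np.concatenate([w.mean(axis=1), w.std(axis=1)]))
    return np.asarray(feats)

rng = np.random.default_rng(0)
joints = rng.normal(size=(12, 600))          # stand-in for 12 gait parameters over time
X = segment_windows(joints)
y = rng.integers(0, 4, size=len(X))          # stand-in labels: 4 gait phases
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:5]))                    # predicted phase per window
```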
Novel view synthesis with light-weight view-dependent texture mapping for a stereoscopic HMD
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-31 | DOI: 10.1109/ICME.2017.8019417
Authors: Thiwat Rongsirigul, Yuta Nakashima, Tomokazu Sato, N. Yokoya
Abstract: The proliferation of off-the-shelf head-mounted displays (HMDs) lets end-users enjoy virtual reality applications, some of which render a real-world scene using a novel view synthesis (NVS) technique. View-dependent texture mapping (VDTM) has been studied for NVS due to its photo-realistic quality. The VDTM technique renders a novel view by adaptively selecting textures from the most appropriate images. However, this process is computationally expensive because VDTM scans every captured image. For stereoscopic HMDs, the situation is much worse because novel views must be rendered once for each eye, almost doubling the cost. This paper proposes a light-weight VDTM tailored for an HMD. In order to reduce the computational cost of VDTM, our method leverages the overlapping fields of view between a stereoscopic pair of HMD images and prunes the images to be scanned. We show through a user study that the proposed method drastically accelerates the VDTM process without spoiling the image quality.
Citations: 2
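A minimal sketch of the pruning idea: keep only cameras whose viewing directions lie near the novel view's, and share the surviving candidate set between the two eyes. The angular-threshold criterion is an assumption; the paper's actual rule is based on the stereo pair's overlapping fields of view.

```python
# Hypothetical sketch: prune candidate texture cameras for a stereo pair by
# keeping only cameras whose viewing direction is close to the novel view's.
import numpy as np

def prune_cameras(cam_dirs, view_dir, max_angle_deg=30.0):
    """Return indices of cameras within max_angle_deg of view_dir."""
    cam_dirs = cam_dirs / np.linalg.norm(cam_dirs, axis=1, keepdims=True)
    view_dir = view_dir / np.linalg.norm(view_dir)
    cos = cam_dirs @ view_dir
    return np.where(cos >= np.cos(np.radians(max_angle_deg)))[0]

rng = np.random.default_rng(1)
cams = rng.normal(size=(200, 3))             # stand-in captured camera directions
left = np.array([0.0, 0.0, 1.0])             # left-eye viewing direction
right = np.array([0.05, 0.0, 1.0])           # right eye looks almost the same way
shared = np.intersect1d(prune_cameras(cams, left), prune_cameras(cams, right))
print(len(shared), "cameras scanned once for both eyes")
```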
Enhancing feature modulation spectra with dictionary learning approaches for robust speech recognition
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-28 | DOI: 10.1109/ICME.2017.8019509
Authors: Bi-Cheng Yan, Chin-Hong Shih, Shih-Hung Liu, Berlin Chen
Abstract: Noise robustness has long garnered much interest from researchers and practitioners in the automatic speech recognition (ASR) community due to its paramount importance to the success of ASR systems. This paper presents a novel approach to improving the noise robustness of speech features, building on top of the dictionary learning paradigm. To this end, we employ the K-SVD method and its variants to create sparse representations with respect to a common set of basis spectral vectors that captures the intrinsic temporal structure inherent in the modulation spectra of clean training speech features. The enhanced modulation spectra of speech features, constructed by mapping the original modulation spectra into the space spanned by these representative basis vectors, can better carry noise-resistant acoustic characteristics. In addition, considering the nonnegative property of the modulation spectrum amplitudes, we utilize the nonnegative K-SVD method, in combination with the nonnegative sparse coding method, to generate more noise-robust speech features. All experiments were conducted and verified using the standard Aurora-2 database and task. The empirical results show that the proposed dictionary learning based approach can provide significant average word error reductions when integrated with either a GMM-HMM or a DNN-HMM based ASR system.
Citations: 0
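A minimal sketch of the enhancement step, using scikit-learn's DictionaryLearning and sparse_encode as stand-ins for K-SVD: learn a basis from clean feature trajectories, sparse-code noisy ones over it, and reconstruct. Dimensions, noise level, and sparsity settings are assumptions.

```python
# Hypothetical sketch: enhance feature trajectories by sparse coding over a
# dictionary learned from clean data (a stand-in for K-SVD).
import numpy as np
from sklearn.decomposition import DictionaryLearning, sparse_encode

rng = np.random.default_rng(2)
clean = rng.normal(size=(500, 64))           # stand-in clean modulation spectra
noisy = clean + 0.3 * rng.normal(size=clean.shape)

dico = DictionaryLearning(n_components=32, alpha=1.0, max_iter=20,
                          random_state=0).fit(clean)
codes = sparse_encode(noisy, dico.components_, algorithm='lasso_lars', alpha=1.0)
enhanced = codes @ dico.components_          # project back onto the learned basis
# The projection should move the noisy spectra closer to the clean ones.
print(np.mean((enhanced - clean) ** 2) < np.mean((noisy - clean) ** 2))
```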
Facial attractiveness computation by label distribution learning with deep CNN and geometric features
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-28 | DOI: 10.1109/ICME.2017.8019454
Authors: Shu Liu, Bo Li, Yangyu Fan, Zhe Guo, A. Samal
Abstract: Facial attractiveness computation is a challenging task because of the lack of labeled data and discriminative features. In this paper, an end-to-end label distribution learning (LDL) framework with a deep convolutional neural network (CNN) and geometric features is proposed to meet these two challenges. Different from previous work, we recast this task as an LDL problem. Compared with single-label regression, LDL significantly improves the generalization ability of our model. In addition, we propose several types of geometric features as well as an incremental feature selection method, which can select hundred-dimensional discriminative geometric features from an exhaustive pool of raw features. More importantly, we find these selected geometric features are complementary to CNN features. Extensive experiments are carried out on the SCUT-FBP dataset, where our approach achieves superior performance in comparison to the state-of-the-art.
Citations: 17
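A minimal sketch of the LDL recasting: spread each scalar attractiveness score into a discretized Gaussian distribution over the rating scale and train against it with a KL-divergence loss. The bin layout and sigma are assumptions, and the "predicted" vector stands in for a CNN's softmax output.

```python
# Hypothetical sketch: scalar beauty scores recast as label distributions.
import numpy as np

def score_to_distribution(mean_score, bins=np.linspace(1, 5, 9), sigma=0.4):
    """Spread a scalar score into a normalized distribution over rating bins."""
    p = np.exp(-0.5 * ((bins - mean_score) / sigma) ** 2)
    return p / p.sum()

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), the usual LDL training loss."""
    return float(np.sum(target * (np.log(target + eps) - np.log(predicted + eps))))

target = score_to_distribution(3.2)          # ground-truth distribution for one face
predicted = score_to_distribution(2.8)       # stand-in for a CNN's softmax output
print(kl_divergence(target, predicted))
```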
Random forest regression based acoustic event detection with bottleneck features
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-28 | DOI: 10.1109/ICME.2017.8019418
Authors: Xianjun Xia, R. Togneri, Ferdous Sohel, David Huang
Abstract: This paper deals with random forest regression based acoustic event detection (AED), combining acoustic features with bottleneck features (BN). Bottleneck features have a good reputation for being inherently discriminative in acoustic signal processing. To deal with unstructured and complex real-world acoustic events, an acoustic event detection system is constructed using bottleneck features combined with acoustic features. Evaluations were carried out on the UPC-TALP and ITC-Irst databases, which consist of highly variable acoustic events. Experimental results demonstrate the usefulness of the low-dimensional and discriminative bottleneck features, with relative decreases in error rates of 5.33% and 5.51%, respectively.
Citations: 12
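A minimal sketch of the feature fusion: concatenate acoustic features with bottleneck features and fit a random forest regressor on frame-level targets. The feature dimensions, the soft targets, and the thresholding step are assumptions.

```python
# Hypothetical sketch: fuse acoustic and bottleneck features, then regress
# frame-level event activity with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
acoustic = rng.normal(size=(1000, 39))       # stand-in MFCC + delta features
bottleneck = rng.normal(size=(1000, 40))     # stand-in DNN bottleneck activations
X = np.hstack([acoustic, bottleneck])        # combined per-frame feature vector
y = rng.random(1000)                         # stand-in soft event-activity targets

reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
events = reg.predict(X) > 0.5                # threshold into event/non-event frames
print(events[:10])
```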
Quality assessment of multi-view-plus-depth images
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-11 | DOI: 10.1109/ICME.2017.8019542
Authors: Jiheng Wang, Shiqi Wang, Kai Zeng, Zhou Wang
Abstract: Multi-view-plus-depth (MVD) representation has gained significant attention recently as a means to encode 3D scenes, allowing for intermediate views to be synthesized on-the-fly at the display site through depth-image-based rendering (DIBR). Automatic quality assessment of MVD images/videos is critical for the optimal design of MVD image/video coding and transmission schemes. Most existing image quality assessment (IQA) and video quality assessment (VQA) methods are applicable only after the DIBR process. Such post-DIBR measures are valuable in assessing overall system performance, but are difficult to employ directly in the encoder optimization process in MVD image/video coding. Here we make one of the first attempts to develop a perceptual pre-DIBR IQA approach for MVD images by employing an information content weighted approach that balances between local quality measures of texture and depth images. Experimental results show that the proposed approach achieves competitive performance when compared with state-of-the-art IQA algorithms applied post-DIBR.
Citations: 6
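A minimal sketch of information content weighted pooling, with local variance as a stand-in for the information content measure: local quality estimates of the texture and depth images are combined under content-dependent weights. All maps below are synthetic placeholders, and the weighting scheme is an assumption rather than the paper's metric.

```python
# Hypothetical sketch: information-content-weighted pooling of texture and
# depth quality maps into a single pre-DIBR score.
import numpy as np

def local_variance(img, k=8):
    """Per-block variance over non-overlapping k x k blocks (a crude proxy
    for local information content)."""
    h, w = img.shape[0] // k * k, img.shape[1] // k * k
    blocks = img[:h, :w].reshape(h // k, k, w // k, k).swapaxes(1, 2)
    return blocks.var(axis=(2, 3))

rng = np.random.default_rng(4)
q_tex = rng.random((64, 64))                     # stand-in local texture quality
q_dep = rng.random((64, 64))                     # stand-in local depth quality
w_tex = local_variance(rng.random((512, 512)))   # weights from texture content
w_dep = local_variance(rng.random((512, 512)))   # weights from depth content

score = (np.sum(w_tex * q_tex) + np.sum(w_dep * q_dep)) / (w_tex.sum() + w_dep.sum())
print(score)
```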
Saliency detection with two-level fully convolutional networks
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019309
Authors: Yang Yi, Li Su, Qingming Huang, Zhe Wu, Chunfeng Wang
Abstract: This paper proposes a deep architecture for saliency detection that fuses pixel-level and superpixel-level predictions. Different from previous methods that either make dense pixel-level predictions with complex networks or region-level predictions for each region with fully-connected layers, this paper investigates an elegant route to make two-level predictions based on the same simple fully convolutional network via a seamless transformation. In the transformation module, we integrate low-level features to model the similarities between pixels and superpixels as well as between superpixels themselves. The pixel-level saliency map detects and highlights the salient object well, and the superpixel-level saliency map preserves sharp boundaries in a complementary way. A shallow fusion net learns to fuse the two saliency maps, followed by a CRF post-refinement module. Experiments on four benchmark data sets demonstrate that our method performs favorably against state-of-the-art methods.
Citations: 5
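A minimal sketch of the two-level idea: pool a pixel-level saliency map over superpixels to get a region-consistent superpixel-level map, then fuse the two. The uniform 0.5/0.5 fusion stands in for the paper's learned shallow fusion net, and the random label map stands in for real superpixels (e.g. SLIC).

```python
# Hypothetical sketch: fuse pixel-level and superpixel-level saliency maps.
import numpy as np

def superpixel_pool(saliency, labels):
    """Replace each superpixel's pixels with the segment-mean saliency."""
    pooled = np.zeros_like(saliency)
    for seg in np.unique(labels):
        mask = labels == seg
        pooled[mask] = saliency[mask].mean()
    return pooled

rng = np.random.default_rng(5)
pixel_map = rng.random((48, 48))               # stand-in pixel-level FCN output
segments = rng.integers(0, 30, size=(48, 48))  # stand-in superpixel label map
sp_map = superpixel_pool(pixel_map, segments)
fused = 0.5 * pixel_map + 0.5 * sp_map         # stand-in for the learned fusion net
print(fused.shape)
```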
Remembering history with convolutional LSTM for anomaly detection
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019325
Authors: Weixin Luo, Wen Liu, Shenghua Gao
Abstract: This paper tackles anomaly detection in videos, an extremely challenging task because anomalies are unbounded. We approach this task by leveraging a Convolutional Neural Network (CNN or ConvNet) to encode the appearance of each frame, and a Convolutional Long Short-Term Memory (ConvLSTM) to memorize all past frames, which corresponds to the motion information. We then integrate the ConvNet and ConvLSTM with an Auto-Encoder, referred to as ConvLSTM-AE, to learn the regularity of appearance and motion for ordinary moments. Compared with 3D Convolutional Auto-Encoder based anomaly detection, our main contribution is a ConvLSTM-AE framework that better encodes the change of appearance and motion for normal events. To evaluate our method, we first conduct experiments on a synthesized Moving-MNIST dataset under controlled settings, and results show that our method can easily identify changes of appearance and motion. Extensive experiments on real anomaly datasets further validate the effectiveness of our method for anomaly detection.
Citations: 354
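A minimal PyTorch sketch of the building block, a ConvLSTM cell that accumulates information across frames. The gate layout is the standard ConvLSTM formulation; the channel sizes, and how the cell would be wired into the full ConvLSTM-AE encoder-decoder, are assumptions.

```python
# Hypothetical sketch: a minimal ConvLSTM cell, the recurrent building block
# of a ConvLSTM-AE that memorizes past frames.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution produces all four gates (input, forget, output, cell).
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.conv(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

cell = ConvLSTMCell(in_ch=16, hid_ch=32)
x = torch.randn(1, 16, 28, 28)               # encoded appearance of one frame
state = (torch.zeros(1, 32, 28, 28), torch.zeros(1, 32, 28, 28))
for _ in range(8):                           # feed a short clip frame by frame
    h, state = cell(x, state)
print(h.shape)                               # hidden map summarizing the history
```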
A subjective visual quality assessment method of panoramic videos
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019351
Authors: Mai Xu, Chen Li, Yufan Liu, Xin Deng, Jiaxin Lu
Abstract: Different from 2-dimensional (2D) videos, panoramic videos offer spherical viewing directions with the support of head-mounted displays, improving the immersive and interactive visual experience. Unfortunately, to the best of our knowledge, there are few subjective visual quality assessment (VQA) methods for panoramic videos. In this paper, we therefore propose a subjective VQA method for assessing the quality loss of impaired panoramic videos. Specifically, we first establish a database containing the viewing-direction data of several subjects watching panoramic videos. We find that viewing directions on panoramic videos are highly consistent across different subjects. Upon this finding, we present a subjective test procedure for measuring the quality of panoramic videos as rated by different subjects, yielding differential mean opinion scores (DMOS). To cope with the inconsistency of viewing directions on panoramic videos, we further propose a vectorized DMOS metric. Finally, experimental results verify that our subjective VQA method, in the form of both the overall and vectorized DMOS metrics, is effective in measuring the subjective quality of panoramic videos.
Citations: 59
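A minimal sketch of DMOS computation: per-subject difference scores between reference and impaired ratings, averaged across subjects. The "vectorized" variant below, which keeps one DMOS per viewing-direction bin, is only a guess at how the paper's metric is organized.

```python
# Hypothetical sketch: overall and per-direction differential mean opinion scores.
import numpy as np

rng = np.random.default_rng(6)
ref_scores = rng.integers(4, 6, size=20).astype(float)  # ratings of reference video
imp_scores = rng.integers(1, 5, size=20).astype(float)  # same raters, impaired video

# Overall DMOS: per-subject difference scores averaged across subjects.
dmos = np.mean(ref_scores - imp_scores)
print(dmos)

# Assumed "vectorized" variant: one DMOS per viewing-direction bin, so that
# direction-dependent quality loss is preserved instead of averaged away.
bins = np.arange(20) % 4                     # stand-in viewing-direction bin ids
vec_dmos = np.array([np.mean((ref_scores - imp_scores)[bins == b]) for b in range(4)])
print(vec_dmos)
```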
DLML: Deep linear mappings learning for face super-resolution with nonlocal-patch
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019298
Authors: T. Lu, Lanlan Pan, Junjun Jiang, Yanduo Zhang, Zixiang Xiong
Abstract: Learning-based face super-resolution approaches rely on a representative dictionary, serving as a self-similarity prior drawn from training samples, to estimate the relationship between low-resolution (LR) and high-resolution (HR) image patches. The most popular approaches learn a mapping function directly from LR patches to HR ones, but neglect the multi-layered nature of the image degradation process (resolution down-sampling), in which observed LR images are gradually formed from their HR versions through successively lower resolutions. In this paper, we present a novel deep linear mappings learning framework for face super-resolution that learns the complex relationship between LR and HR features by alternately updating multi-layered embedding dictionaries and linear mapping matrices instead of mapping directly. Furthermore, in contrast to existing position-based studies that use only local patches as a self-similarity prior, we develop a feature-induced nonlocal dictionary pair embedding method to support hierarchical multiple linear mappings learning. With the coarse-to-fine nature of the deep learning architecture, cascaded incremental linear mapping matrices can be used to exploit the complex relationship between LR and HR images. Experimental results demonstrate that this framework outperforms the state-of-the-art (including both general and face super-resolution approaches) on the FEI face database.
Citations: 12
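A minimal worked example of one linear mapping layer: given paired LR/HR patch features, the matrix M minimizing ||H - M L||^2 + lambda ||M||^2 has the closed form M = H L^T (L L^T + lambda I)^{-1}. The paper cascades such mappings over learned nonlocal embeddings; this sketch omits the embedding and cascade, and all dimensions are assumptions.

```python
# Hypothetical sketch: one LR->HR linear mapping layer solved in closed form
# (ridge regression); DLML would cascade several such layers.
import numpy as np

rng = np.random.default_rng(7)
L = rng.normal(size=(64, 1000))              # LR patch features (dim x samples)
M_true = rng.normal(size=(256, 64))
H = M_true @ L + 0.01 * rng.normal(size=(256, 1000))  # paired HR patch features

lam = 1e-2                                   # ridge regularizer
M = H @ L.T @ np.linalg.inv(L @ L.T + lam * np.eye(64))
print(np.linalg.norm(M @ L - H) / np.linalg.norm(H))  # small relative error
```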