2017 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

Gait phase classification for in-home gait assessment
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-31 | DOI: 10.1109/ICME.2017.8019500
Authors: Minxiang Ye, Cheng Yang, V. Stanković, L. Stanković, Samuel Cheng
Abstract: With a growing ageing population, acquiring joint measurements with sufficient accuracy for reliable gait assessment is essential. Additionally, the quality of gait analysis relies heavily on accurate feature selection and classification. Sensor-driven and one-camera optical motion capture systems are becoming increasingly popular in the scientific literature due to their portability and cost-efficacy. In this paper, we propose 12 gait parameters to characterise gait patterns and a novel gait-phase classifier, resulting in classification performance comparable with a state-of-the-art multi-sensor optical motion system. Furthermore, a novel multi-channel time series segmentation method is proposed that maximizes the temporal information of the gait parameters, improving the final classification success rate after gait event reconstruction. The validation, conducted over 126 experiments on 6 healthy volunteers and 9 stroke patients with hand-labelled ground-truth gait phases, demonstrates high gait classification accuracy.
Citations: 10
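A minimal sketch of the general pipeline the abstract describes: segment a multi-channel time series into overlapping windows and classify each window into a gait phase. The window sizes, the summary statistics, and the random forest below are stand-ins, not the paper's 12 gait parameters or its classifier.

```python
# Hypothetical sketch: windowed multi-channel gait-phase classification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment_windows(signals, win=30, hop=10):
    """Slice a (channels, T) multi-channel series into overlapping windows
    and summarize each window with simple per-channel statistics."""
    feats = []
    for start in range(0, signals.shape[1] - win + 1, hop):
        w = signals[:, start:start + win]
        feats.append(np.concatenate([w.mean(axis=1), w.std(axis=1)]))
    return np.asarray(feats)

rng = np.random.default_rng(0)
joints = rng.normal(size=(12, 600))          # stand-in for 12 gait parameters over time
X = segment_windows(joints)
y = rng.integers(0, 4, size=len(X))          # stand-in labels: 4 gait phases
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:5]))                    # predicted phase per window
```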
Novel view synthesis with light-weight view-dependent texture mapping for a stereoscopic HMD
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-31 | DOI: 10.1109/ICME.2017.8019417
Authors: Thiwat Rongsirigul, Yuta Nakashima, Tomokazu Sato, N. Yokoya
Abstract: The proliferation of off-the-shelf head-mounted displays (HMDs) lets end-users enjoy virtual reality applications, some of which render a real-world scene using a novel view synthesis (NVS) technique. View-dependent texture mapping (VDTM) has been studied for NVS due to its photo-realistic quality. The VDTM technique renders a novel view by adaptively selecting textures from the most appropriate images. However, this process is computationally expensive because VDTM scans every captured image. For stereoscopic HMDs, the situation is much worse because novel views must be rendered once for each eye, almost doubling the cost. This paper proposes a light-weight VDTM tailored for an HMD. In order to reduce the computational cost of VDTM, our method leverages the overlapping fields of view between a stereoscopic pair of HMD images and prunes the images to be scanned. We show through a user study that the proposed method drastically accelerates the VDTM process without spoiling the image quality.
Citations: 2
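A minimal sketch of the pruning idea: keep only cameras whose viewing directions lie near the novel view's, and share the surviving candidate set between the two eyes. The angular-threshold criterion is an assumption; the paper's actual rule is based on the stereo pair's overlapping fields of view.

```python
# Hypothetical sketch: prune candidate texture cameras for a stereo pair by
# keeping only cameras whose viewing direction is close to the novel view's.
import numpy as np

def prune_cameras(cam_dirs, view_dir, max_angle_deg=30.0):
    """Return indices of cameras within max_angle_deg of view_dir."""
    cam_dirs = cam_dirs / np.linalg.norm(cam_dirs, axis=1, keepdims=True)
    view_dir = view_dir / np.linalg.norm(view_dir)
    cos = cam_dirs @ view_dir
    return np.where(cos >= np.cos(np.radians(max_angle_deg)))[0]

rng = np.random.default_rng(1)
cams = rng.normal(size=(200, 3))             # stand-in captured camera directions
left = np.array([0.0, 0.0, 1.0])             # left-eye viewing direction
right = np.array([0.05, 0.0, 1.0])           # right eye looks almost the same way
shared = np.intersect1d(prune_cameras(cams, left), prune_cameras(cams, right))
print(len(shared), "cameras scanned once for both eyes")
```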
Enhancing feature modulation spectra with dictionary learning approaches for robust speech recognition
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-28 | DOI: 10.1109/ICME.2017.8019509
Authors: Bi-Cheng Yan, Chin-Hong Shih, Shih-Hung Liu, Berlin Chen
Abstract: Noise robustness has long garnered much interest from researchers and practitioners in the automatic speech recognition (ASR) community due to its paramount importance to the success of ASR systems. This paper presents a novel approach to improving the noise robustness of speech features, building on top of the dictionary learning paradigm. To this end, we employ the K-SVD method and its variants to create sparse representations with respect to a common set of basis spectral vectors that captures the intrinsic temporal structure inherent in the modulation spectra of clean training speech features. The enhanced modulation spectra of speech features, constructed by mapping the original modulation spectra into the space spanned by these representative basis vectors, can better carry noise-resistant acoustic characteristics. In addition, considering the nonnegative property of the modulation spectrum amplitudes, we utilize the nonnegative K-SVD method, in combination with the nonnegative sparse coding method, to generate more noise-robust speech features. All experiments were conducted and verified using the standard Aurora-2 database and task. The empirical results show that the proposed dictionary learning based approach can provide significant average word error reductions when integrated with either a GMM-HMM or a DNN-HMM based ASR system.
Citations: 0
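A minimal sketch of the enhancement step, using scikit-learn's DictionaryLearning and sparse_encode as stand-ins for K-SVD: learn a basis from clean feature trajectories, sparse-code noisy ones over it, and reconstruct. Dimensions, noise level, and sparsity settings are assumptions.

```python
# Hypothetical sketch: enhance feature trajectories by sparse coding over a
# dictionary learned from clean data (a stand-in for K-SVD).
import numpy as np
from sklearn.decomposition import DictionaryLearning, sparse_encode

rng = np.random.default_rng(2)
clean = rng.normal(size=(500, 64))           # stand-in clean modulation spectra
noisy = clean + 0.3 * rng.normal(size=clean.shape)

dico = DictionaryLearning(n_components=32, alpha=1.0, max_iter=20,
                          random_state=0).fit(clean)
codes = sparse_encode(noisy, dico.components_, algorithm='lasso_lars', alpha=1.0)
enhanced = codes @ dico.components_          # project back onto the learned basis
# The projection should move the noisy spectra closer to the clean ones.
print(np.mean((enhanced - clean) ** 2) < np.mean((noisy - clean) ** 2))
```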
Facial attractiveness computation by label distribution learning with deep CNN and geometric features
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-28 | DOI: 10.1109/ICME.2017.8019454
Authors: Shu Liu, Bo Li, Yangyu Fan, Zhe Guo, A. Samal
Abstract: Facial attractiveness computation is a challenging task because of the lack of labeled data and discriminative features. In this paper, an end-to-end label distribution learning (LDL) framework with a deep convolutional neural network (CNN) and geometric features is proposed to meet these two challenges. Different from previous work, we recast this task as an LDL problem. Compared with single-label regression, LDL significantly improves the generalization ability of our model. In addition, we propose several types of geometric features as well as an incremental feature selection method, which can select hundred-dimensional discriminative geometric features from an exhaustive pool of raw features. More importantly, we find these selected geometric features are complementary to CNN features. Extensive experiments are carried out on the SCUT-FBP dataset, where our approach achieves superior performance in comparison to the state-of-the-art.
Citations: 17
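A minimal sketch of the LDL recasting: spread each scalar attractiveness score into a discretized Gaussian distribution over the rating scale and train against it with a KL-divergence loss. The bin layout and sigma are assumptions, and the "predicted" vector stands in for a CNN's softmax output.

```python
# Hypothetical sketch: scalar beauty scores recast as label distributions.
import numpy as np

def score_to_distribution(mean_score, bins=np.linspace(1, 5, 9), sigma=0.4):
    """Spread a scalar score into a normalized distribution over rating bins."""
    p = np.exp(-0.5 * ((bins - mean_score) / sigma) ** 2)
    return p / p.sum()

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), the usual LDL training loss."""
    return float(np.sum(target * (np.log(target + eps) - np.log(predicted + eps))))

target = score_to_distribution(3.2)          # ground-truth distribution for one face
predicted = score_to_distribution(2.8)       # stand-in for a CNN's softmax output
print(kl_divergence(target, predicted))
```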
Random forest regression based acoustic event detection with bottleneck features
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-08-28 | DOI: 10.1109/ICME.2017.8019418
Authors: Xianjun Xia, R. Togneri, Ferdous Sohel, David Huang
Abstract: This paper deals with random forest regression based acoustic event detection (AED), combining acoustic features with bottleneck features (BN). Bottleneck features have a good reputation for being inherently discriminative in acoustic signal processing. To deal with unstructured and complex real-world acoustic events, an acoustic event detection system is constructed using bottleneck features combined with acoustic features. Evaluations were carried out on the UPC-TALP and ITC-Irst databases, which consist of highly variable acoustic events. Experimental results demonstrate the usefulness of the low-dimensional and discriminative bottleneck features, with relative decreases in error rates of 5.33% and 5.51%, respectively.
Citations: 12
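A minimal sketch of the feature fusion: concatenate acoustic features with bottleneck features and fit a random forest regressor on frame-level targets. The feature dimensions, the soft targets, and the thresholding step are assumptions.

```python
# Hypothetical sketch: fuse acoustic and bottleneck features, then regress
# frame-level event activity with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
acoustic = rng.normal(size=(1000, 39))       # stand-in MFCC + delta features
bottleneck = rng.normal(size=(1000, 40))     # stand-in DNN bottleneck activations
X = np.hstack([acoustic, bottleneck])        # combined per-frame feature vector
y = rng.random(1000)                         # stand-in soft event-activity targets

reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
events = reg.predict(X) > 0.5                # threshold into event/non-event frames
print(events[:10])
```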
Quality assessment of multi-view-plus-depth images
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-11 | DOI: 10.1109/ICME.2017.8019542
Authors: Jiheng Wang, Shiqi Wang, Kai Zeng, Zhou Wang
Abstract: Multi-view-plus-depth (MVD) representation has gained significant attention recently as a means to encode 3D scenes, allowing for intermediate views to be synthesized on-the-fly at the display site through depth-image-based rendering (DIBR). Automatic quality assessment of MVD images/videos is critical for the optimal design of MVD image/video coding and transmission schemes. Most existing image quality assessment (IQA) and video quality assessment (VQA) methods are applicable only after the DIBR process. Such post-DIBR measures are valuable in assessing overall system performance, but are difficult to employ directly in the encoder optimization process in MVD image/video coding. Here we make one of the first attempts to develop a perceptual pre-DIBR IQA approach for MVD images by employing an information content weighted approach that balances between local quality measures of texture and depth images. Experimental results show that the proposed approach achieves competitive performance when compared with state-of-the-art IQA algorithms applied post-DIBR.
Citations: 6
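A minimal sketch of information content weighted pooling, with local variance as a stand-in for the information content measure: local quality estimates of the texture and depth images are combined under content-dependent weights. All maps below are synthetic placeholders, and the weighting scheme is an assumption rather than the paper's metric.

```python
# Hypothetical sketch: information-content-weighted pooling of texture and
# depth quality maps into a single pre-DIBR score.
import numpy as np

def local_variance(img, k=8):
    """Per-block variance over non-overlapping k x k blocks (a crude proxy
    for local information content)."""
    h, w = img.shape[0] // k * k, img.shape[1] // k * k
    blocks = img[:h, :w].reshape(h // k, k, w // k, k).swapaxes(1, 2)
    return blocks.var(axis=(2, 3))

rng = np.random.default_rng(4)
q_tex = rng.random((64, 64))                     # stand-in local texture quality
q_dep = rng.random((64, 64))                     # stand-in local depth quality
w_tex = local_variance(rng.random((512, 512)))   # weights from texture content
w_dep = local_variance(rng.random((512, 512)))   # weights from depth content

score = (np.sum(w_tex * q_tex) + np.sum(w_dep * q_dep)) / (w_tex.sum() + w_dep.sum())
print(score)
```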
Saliency detection with two-level fully convolutional networks
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019309
Authors: Yang Yi, Li Su, Qingming Huang, Zhe Wu, Chunfeng Wang
Abstract: This paper proposes a deep architecture for saliency detection that fuses pixel-level and superpixel-level predictions. Different from previous methods that either make dense pixel-level predictions with complex networks or region-level predictions for each region with fully-connected layers, this paper investigates an elegant route to make two-level predictions based on the same simple fully convolutional network via a seamless transformation. In the transformation module, we integrate low-level features to model the similarities between pixels and superpixels as well as between superpixels themselves. The pixel-level saliency map detects and highlights the salient object well, and the superpixel-level saliency map preserves sharp boundaries in a complementary way. A shallow fusion net learns to fuse the two saliency maps, followed by a CRF post-refinement module. Experiments on four benchmark data sets demonstrate that our method performs favorably against state-of-the-art methods.
Citations: 5
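A minimal sketch of the two-level idea: pool a pixel-level saliency map over superpixels to get a region-consistent superpixel-level map, then fuse the two. The uniform 0.5/0.5 fusion stands in for the paper's learned shallow fusion net, and the random label map stands in for real superpixels (e.g. SLIC).

```python
# Hypothetical sketch: fuse pixel-level and superpixel-level saliency maps.
import numpy as np

def superpixel_pool(saliency, labels):
    """Replace each superpixel's pixels with the segment-mean saliency."""
    pooled = np.zeros_like(saliency)
    for seg in np.unique(labels):
        mask = labels == seg
        pooled[mask] = saliency[mask].mean()
    return pooled

rng = np.random.default_rng(5)
pixel_map = rng.random((48, 48))               # stand-in pixel-level FCN output
segments = rng.integers(0, 30, size=(48, 48))  # stand-in superpixel label map
sp_map = superpixel_pool(pixel_map, segments)
fused = 0.5 * pixel_map + 0.5 * sp_map         # stand-in for the learned fusion net
print(fused.shape)
```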
Remembering history with convolutional LSTM for anomaly detection
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019325
Authors: Weixin Luo, Wen Liu, Shenghua Gao
Abstract: This paper tackles anomaly detection in videos, an extremely challenging task because anomalies are unbounded. We approach this task by leveraging a Convolutional Neural Network (CNN or ConvNet) to encode the appearance of each frame, and a Convolutional Long Short-Term Memory (ConvLSTM) to memorize all past frames, which corresponds to the motion information. We then integrate the ConvNet and ConvLSTM with an Auto-Encoder, referred to as ConvLSTM-AE, to learn the regularity of appearance and motion for ordinary moments. Compared with 3D Convolutional Auto-Encoder based anomaly detection, our main contribution is a ConvLSTM-AE framework that better encodes the change of appearance and motion for normal events. To evaluate our method, we first conduct experiments on a synthesized Moving-MNIST dataset under controlled settings, and results show that our method can easily identify changes of appearance and motion. Extensive experiments on real anomaly datasets further validate the effectiveness of our method for anomaly detection.
Citations: 354
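A minimal PyTorch sketch of the building block, a ConvLSTM cell that accumulates information across frames. The gate layout is the standard ConvLSTM formulation; the channel sizes, and how the cell would be wired into the full ConvLSTM-AE encoder-decoder, are assumptions.

```python
# Hypothetical sketch: a minimal ConvLSTM cell, the recurrent building block
# of a ConvLSTM-AE that memorizes past frames.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution produces all four gates (input, forget, output, cell).
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.conv(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

cell = ConvLSTMCell(in_ch=16, hid_ch=32)
x = torch.randn(1, 16, 28, 28)               # encoded appearance of one frame
state = (torch.zeros(1, 32, 28, 28), torch.zeros(1, 32, 28, 28))
for _ in range(8):                           # feed a short clip frame by frame
    h, state = cell(x, state)
print(h.shape)                               # hidden map summarizing the history
```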
A subjective visual quality assessment method of panoramic videos
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019351
Authors: Mai Xu, Chen Li, Yufan Liu, Xin Deng, Jiaxin Lu
Abstract: Different from 2-dimensional (2D) videos, panoramic videos offer spherical viewing directions with the support of head-mounted displays, improving the immersive and interactive visual experience. Unfortunately, to the best of our knowledge, there are few subjective visual quality assessment (VQA) methods for panoramic videos. In this paper, we therefore propose a subjective VQA method for assessing the quality loss of impaired panoramic videos. Specifically, we first establish a database containing the viewing-direction data of several subjects watching panoramic videos. We find that viewing directions on panoramic videos are highly consistent across different subjects. Upon this finding, we present a subjective test procedure for measuring the quality of panoramic videos as rated by different subjects, yielding differential mean opinion scores (DMOS). To cope with the inconsistency of viewing directions on panoramic videos, we further propose a vectorized DMOS metric. Finally, experimental results verify that our subjective VQA method, in the form of both the overall and vectorized DMOS metrics, is effective in measuring the subjective quality of panoramic videos.
Citations: 59
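A minimal sketch of DMOS computation: per-subject difference scores between reference and impaired ratings, averaged across subjects. The "vectorized" variant below, which keeps one DMOS per viewing-direction bin, is only a guess at how the paper's metric is organized.

```python
# Hypothetical sketch: overall and per-direction differential mean opinion scores.
import numpy as np

rng = np.random.default_rng(6)
ref_scores = rng.integers(4, 6, size=20).astype(float)  # ratings of reference video
imp_scores = rng.integers(1, 5, size=20).astype(float)  # same raters, impaired video

# Overall DMOS: per-subject difference scores averaged across subjects.
dmos = np.mean(ref_scores - imp_scores)
print(dmos)

# Assumed "vectorized" variant: one DMOS per viewing-direction bin, so that
# direction-dependent quality loss is preserved instead of averaged away.
bins = np.arange(20) % 4                     # stand-in viewing-direction bin ids
vec_dmos = np.array([np.mean((ref_scores - imp_scores)[bins == b]) for b in range(4)])
print(vec_dmos)
```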
DLML: Deep linear mappings learning for face super-resolution with nonlocal-patch
2017 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2017-07-10 | DOI: 10.1109/ICME.2017.8019298
Authors: T. Lu, Lanlan Pan, Junjun Jiang, Yanduo Zhang, Zixiang Xiong
Abstract: Learning-based face super-resolution approaches rely on a representative dictionary, serving as a self-similarity prior drawn from training samples, to estimate the relationship between low-resolution (LR) and high-resolution (HR) image patches. The most popular approaches learn a mapping function directly from LR patches to HR ones, but neglect the multi-layered nature of the image degradation process (resolution down-sampling), in which observed LR images are gradually formed from their HR versions through successively lower resolutions. In this paper, we present a novel deep linear mappings learning framework for face super-resolution that learns the complex relationship between LR and HR features by alternately updating multi-layered embedding dictionaries and linear mapping matrices instead of mapping directly. Furthermore, in contrast to existing position-based studies that use only local patches as a self-similarity prior, we develop a feature-induced nonlocal dictionary pair embedding method to support hierarchical multiple linear mappings learning. With the coarse-to-fine nature of the deep learning architecture, cascaded incremental linear mapping matrices can be used to exploit the complex relationship between LR and HR images. Experimental results demonstrate that this framework outperforms the state-of-the-art (including both general and face super-resolution approaches) on the FEI face database.
Citations: 12
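A minimal worked example of one linear mapping layer: given paired LR/HR patch features, the matrix M minimizing ||H - M L||^2 + lambda ||M||^2 has the closed form M = H L^T (L L^T + lambda I)^{-1}. The paper cascades such mappings over learned nonlocal embeddings; this sketch omits the embedding and cascade, and all dimensions are assumptions.

```python
# Hypothetical sketch: one LR->HR linear mapping layer solved in closed form
# (ridge regression); DLML would cascade several such layers.
import numpy as np

rng = np.random.default_rng(7)
L = rng.normal(size=(64, 1000))              # LR patch features (dim x samples)
M_true = rng.normal(size=(256, 64))
H = M_true @ L + 0.01 * rng.normal(size=(256, 1000))  # paired HR patch features

lam = 1e-2                                   # ridge regularizer
M = H @ L.T @ np.linalg.inv(L @ L.T + lam * np.eye(64))
print(np.linalg.norm(M @ L - H) / np.linalg.norm(H))  # small relative error
```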