{"title":"Double-DCCCAE: Estimation of Body Gestures From Speech Waveform","authors":"Jinhong Lu, Tianhang Liu, Shuzhuang Xu, H. Shimodaira","doi":"10.1109/ICASSP39728.2021.9414660","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414660","url":null,"abstract":"This paper presents an approach for body-motion estimation from audio-speech waveform, where context information in both input and output streams is taken in to account without using recurrent models. Previous works commonly use multiple frames of input to estimate one frame of motion data, where the temporal information of the generated motion is little considered. To resolve the problems, we extend our previous work and propose a system, double deep canonical-correlation-constrained autoencoder (D-DCCCAE), which encodes each of speech and motion segments into fixed-length embedded features that are well correlated with the segments of the other modality. The learnt motion embedded feature is estimated from the learnt speech-embedded feature through a simple neural network and further decoded back to the sequential motion. The proposed pair of embedded features showed higher correlation than spectral features with motion data, and our model was more preferred than the baseline model (BA) in terms of human-likeness and comparable in terms of similar appropriateness.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122187130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-Convex Sparse Deviation Modeling Via Generative Models","authors":"Yaxi Yang, Hailin Wang, Haiquan Qiu, Jianjun Wang, Yao Wang","doi":"10.1109/ICASSP39728.2021.9414170","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414170","url":null,"abstract":"In this paper, the generative model is used to introduce the structural properties of the signal to replace the common sparse hypothesis, and a non-convex compressed sensing sparse deviation model based on the generative model (ℓq-Gen) is proposed. By establishing ℓq variant of the restricted isometry property (q-RIP) and Set-Restricted Eigenvalue Condition (q-S-REC), the error upper bound of the optimal decoder is derived when the recovered signal is within the sparse deviation range of the generator. Furthermore, it is proved that the Gaussian matrix satisfying a certain number of measurements is sufficient to ensure a good recovery for the generating function with high probability. Finally, a series of experiments are carried out to verify the effectiveness and superiority of the ℓq-Gen model.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117016966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Adaptive Pyramid Single-View Depth Lookup Table Coding Method","authors":"Yangang Cai, Ronggang Wang, Song Gu, Jian Zhang, Wen Gao","doi":"10.1109/ICASSP39728.2021.9414584","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414584","url":null,"abstract":"As depth maps show unique characteristics like piecewise smooth regions bounded by sharp edges at depth discontinuities, new coding tools are required to approximate these signal characteristics. Moreover, the number of bits to signal the residual values for each segment can be further reduced by integrating a Depth Lookup Table (DLT), which maps depth values to valid depth values of the original depth map. The DLT is constructed based on an initial analysis of the input depth map and is then coded in the sequence header. In this paper, an adaptive pyramid single-view depth lookup table coding method is proposed, with the purpose of designing a clean syntax structure in the sequence header with reasonably good performance. Experiments show that the proposed method can reduce about 84.97% coding bits on average.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117270011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Instrument Classification of Solo Sheet Music Images","authors":"Kevin Ji, Daniel Yang, T. Tsai","doi":"10.1109/ICASSP39728.2021.9413732","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413732","url":null,"abstract":"This paper studies instrument classification of solo sheet music. Whereas previous work has focused on instrument recognition in audio data, we instead approach the instrument classification problem using raw sheet music images. Our approach first converts the sheet music image into a sequence of musical words based on the bootleg score representation, and then treats the problem as a text classification task. We show that it is possible to significantly improve classifier performance by training a language model on unlabeled data, initializing a classifier with the pretrained language model weights, and then finetuning the classifier on labeled data. In this work, we train AWD-LSTM, GPT-2, and RoBERTa models on solo sheet music images from IMSLP for eight different instruments. We find that GPT-2 and RoBERTa slightly outperform AWD-LSTM, and that pretraining increases classification accuracy for RoBERTa from 34.5% to 42.9%. Furthermore, we propose two data augmentation methods that increase classification accuracy for RoBERTa by an additional 15%.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128638017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Radio Modulation Classification With An LSTM Auto-Encoder","authors":"Ziqi Ke, H. Vikalo","doi":"10.1109/ICASSP39728.2021.9414351","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414351","url":null,"abstract":"Identifying modulation type of a received radio signal is a challenging problem encountered in many applications including radio interference mitigation and spectrum allocation. This problem is rendered challenging by the existence of a large number of modulation schemes and numerous sources of interference. Existing methods for monitoring spectrum readily collect large amounts of radio signals. However, existing state-of-the-art approaches to modulation classification struggle to reach desired levels of accuracy with computational efficiency practically feasible for implementation on low-cost computational platforms. To this end, we propose a learning framework based on an LSTM denoising autoencoder designed to extract robust and stable features from the noisy received signals, and detect the underlying modulation scheme. The method uses a compact architecture that may be implemented on low-cost computational devices while achieving or exceeding state-of-the-art classification accuracy. Experimental results on realistic synthetic and over-the-air radio data show that the proposed framework reliably and efficiently classifies radio signals, and often significantly outperform state-of-the-art approaches.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"84 Pt 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129006633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applied Methods for Sparse Sampling of Head-Related Transfer Functions","authors":"Lior Arbel, Z. Ben-Hur, D. Alon, B. Rafaely","doi":"10.1109/ICASSP39728.2021.9413976","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413976","url":null,"abstract":"Production of high fidelity spatial audio applications requires individual head-related transfer functions (HRTFs). As the acquisition of HRTF is an elaborate process, interest lies in interpolating full length HRTF from sparse samples. Ear-alignment is a recently developed pre-processing technique, shown to reduce an HRTF’s spherical harmonics order, thus permitting sparse sampling over fewer directions. This paper describes the application of two methods for ear-aligned HRTF interpolation by sparse sampling: Orthogonal Matching Pursuit and Principal Component Analysis. These methods consist of generating unique vector sets for HRTF representation. The methods were tested over an HRTF dataset, indicating that interpolation errors using small sampling schemes may be further reduced by up to 5 dB in comparison with spherical harmonics interpolation.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"94 2 Pt 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129454743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Scale Residual Network for Covid-19 Diagnosis Using Ct-Scans","authors":"Pratyush Garg, R. Ranjan, Kamini Upadhyay, M. Agrawal, D. Deepak","doi":"10.1109/ICASSP39728.2021.9414426","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414426","url":null,"abstract":"To mitigate the outbreak of highly contagious COVID-19, we need a sensitive, robust automated diagnostic tool. This paper proposes a three-level approach to separate the cases of COVID-19, pneumonia from normal patients using chest CT scans. At the first level, we fine tune a multi-scale ResNet50 model for feature extraction from all the slices of CT scan for each patient. By using multi-scale residual network, we can learn different sizes of infection, thereby making the detection possible at early stages too. These extracted features are used to train a patient-level classifier, at the second level. Four different classifiers are trained at this stage. Finally, predictions of patient level classifiers are combined by training an ensemble classifier. We test the proposed method on three sets of data released by ICASSP, COVID-19 Signal Processing Grand Challenge (SPGC). The proposed method has been successful in classifying the three classes with a validation accuracy of 94.9% and testing accuracy of 88.89%.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128991469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Dialogue Response Generation Via Knowledge Graph Filter","authors":"Yanmeng Wang, Ye Wang, Xingyu Lou, Wenge Rong, Zhenghong Hao, Shaojun Wang","doi":"10.1109/ICASSP39728.2021.9414324","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414324","url":null,"abstract":"Current generative dialogue systems tend to produce generic dialog responses, which lack useful information and semantic coherence. An promising method to alleviate this problem is to integrate knowledge triples from knowledge base. However, current approaches mainly augment Seq2Seq framework with knowledge-aware mechanism to retrieve a large number of knowledge triples without considering specific dialogue context, which probably results in knowledge redundancy and incomplete knowledge comprehension. In this paper, we propose to leverage the contextual word representation of dialog post to filter out irrelevant knowledge with an attention-based triple filter network. We introduce a novel knowledge-enriched framework to integrate the filtered knowledge into the dialogue representation. Entity copy is further proposed to facilitate the integration of the knowledge during generation. Experiments on dialogue generation tasks have shown the proposed framework’s promising potential.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123827616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perceptual Quality Assessment for Recognizing True and Pseudo 4k Content","authors":"Wenhan Zhu, Guangtao Zhai, Xiongkuo Min, Xiaokang Yang, Xiao-Ping Zhang","doi":"10.1109/ICASSP39728.2021.9414932","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414932","url":null,"abstract":"To meet the imperative demand for monitoring the quality of Ultra High-Definition (UHD) content in multimedia industries, we propose an efficient no-reference (NR) image quality assessment (IQA) metric to distinguish original and pseudo 4K contents and measure the quality of their quality in this paper. First, we establish a database including more than 3000 4K images composed of natural 4K images together with upscaled versions interpolated from 1080p and 720p images by fourteen algorithms. To improve computing efficiency, our model segments the input image and selects three representative patches by local variances. Then, we extract the histogram features and cut-off frequency features in the frequency domain as well as the natural scenes statistic (NSS) based features from the representative patches. Finally, we employ support vector regressor (SVR) to aggregate these extracted features as an overall quality metric to predict the quality score of the target image. Extensive experimental comparisons using seven common evaluation indicators demonstrate that the proposed model outperforms the competitive NR IQA methods and has a great ability to distinguish true and pseudo 4K images.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124224000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Checking PRNU Usability on Modern Devices","authors":"C. Albisani, Massimo Iuliani, Alessandro Piva","doi":"10.1109/ICASSP39728.2021.9413611","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413611","url":null,"abstract":"The image source identification task is mainly addressed by exploiting the unique traces of the sensor pattern noise, that ensure a negligible false alarm rate when comparing patterns extracted from different devices, even of the same brand or model. However, most recent smartphones are equipped with proprietary in-camera processing that can possibly expose unexpected correlated patterns within images belonging to different sensors.In this paper, we first highlight that wrong source attribution can happen on smartphones belonging to the same brand when images are acquired both in default and in bokeh mode. While the bokeh mode is proved to introduce a correlated pattern due to the specific in-camera post-processing, we also show that natural images also expose such issue, even when a reference from flat images is available. Furthermore, different camera models expose different correlation patterns since they are reasonably related to developers’ choices. Then, we propose a general strategy that allows the forensic practitioner to determine whether a questioned device may suffer from these correlated patterns, thus avoiding the risk of false image attribution.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121192517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}