ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Range Guided Depth Refinement and Uncertainty-Aware Aggregation for View Synthesis
Yuan Chang, Yisong Chen, Guoping Wang
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413981 (published 2021-06-06)
Abstract: In this paper, we present a view synthesis framework comprising range-guided depth refinement and uncertainty-aware aggregation for novel view synthesis. We first propose a novel depth refinement method that improves the quality and robustness of depth map reconstruction: a range prior constrains the estimated depth, yielding more accurate depth information. We then propose an uncertainty-aware aggregation method for novel view synthesis: we compute the uncertainty of the estimated depth at each pixel and reduce the influence of pixels whose uncertainty is large when synthesizing novel views, which suppresses artifacts such as ghosting and blurring. We validate our algorithm experimentally and show that it achieves state-of-the-art performance.
Citations: 0
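To make the aggregation idea concrete, here is a minimal NumPy sketch of inverse-uncertainty blending of warped source views; the function name, array shapes, and the particular 1/(u+eps) weighting are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def uncertainty_aware_aggregate(candidates, uncertainties, eps=1e-6):
    """Blend colors warped from V source views into one novel view.

    candidates:    (V, H, W, 3) per-view warped colors
    uncertainties: (V, H, W)    per-pixel depth uncertainty (larger = less reliable)
    """
    weights = 1.0 / (uncertainties + eps)           # down-weight uncertain pixels
    weights /= weights.sum(axis=0, keepdims=True)   # normalize over views
    return (weights[..., None] * candidates).sum(axis=0)

# Toy usage: two 4x4 views, the second far less certain,
# so the blend stays close to the first view.
views = np.random.rand(2, 4, 4, 3)
unc = np.stack([np.full((4, 4), 0.1), np.full((4, 4), 10.0)])
novel = uncertainty_aware_aggregate(views, unc)
```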
RGLN: Robust Residual Graph Learning Networks via Similarity-Preserving Mapping on Graphs
Jiaxiang Tang, Xiang Gao, Wei Hu
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414792 (published 2021-06-06)
Abstract: Graph Convolutional Neural Networks (GCNNs) extend CNNs to irregular graph data domains such as brain networks, citation networks, and 3D point clouds. Identifying an appropriate graph for the basic operations in GCNNs is critical. Existing methods often manually construct or learn one fixed graph from known connectivities, which may be sub-optimal. Instead, we propose a residual graph learning paradigm that infers edge connectivities and weights, cast as distance metric learning under a low-rank assumption and a similarity-preserving regularization. In particular, we learn the underlying graph via a similarity-preserving mapping that keeps similar nodes close and pushes dissimilar nodes apart. Extensive experiments on semi-supervised learning over citation networks and 3D point clouds show that we achieve state-of-the-art performance in terms of both accuracy and robustness.
Citations: 4
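The similarity-preserving mapping can be illustrated with a small sketch that turns node-feature distances, measured after a low-rank projection, into Gaussian edge weights; the random projection below stands in for the learned metric, and all names are hypothetical.

```python
import numpy as np

def learn_graph(X, rank=2, sigma=1.0, seed=0):
    """Infer a weighted adjacency matrix from node features X of shape (N, d).

    Distances are measured after a low-rank projection (random here, learned
    in the paper) and mapped through a Gaussian kernel, so similar nodes get
    strong edges and dissimilar nodes get weak ones.
    """
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((X.shape[1], rank))          # stand-in low-rank metric
    Z = X @ P
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)  # squared pairwise distances
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                             # no self-loops
    return W

print(learn_graph(np.random.rand(5, 16)).round(2))
```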
Improving Ultrasound Tongue Contour Extraction Using U-Net and Shape Consistency-Based Regularizer
Ming Feng, Yin Wang, Kele Xu, Huaimin Wang, Bo Ding
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414420 (published 2021-06-06)
Abstract: B-mode ultrasound tongue imaging is widely used to visualize tongue motion owing to its appealing properties. Extracting the tongue surface contour from B-mode ultrasound images remains challenging, yet it is a prerequisite for further quantitative analysis. Deep learning-based approaches have recently been adopted for this task, but standard deep models fail on faint contours that arise when the ultrasound wave travels parallel to the tongue surface. To address faint or missing contours in a sequence, we explore a shape consistency-based regularizer that takes sequential information into account. With the regularizer, the deep model not only extracts frame-specific contours but also enforces similarity between contours extracted from adjacent frames. Extensive experiments on both synthetic and real ultrasound tongue imaging datasets demonstrate the effectiveness of the proposed method. To promote research in this field, we have released our code.
Citations: 2
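A plausible form of the shape consistency regularizer is a penalty on differences between contour maps predicted for adjacent frames, added to a per-frame segmentation loss. The PyTorch sketch below assumes this L1 form and the weight lam; both are our guesses at the spirit of the method, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def contour_loss(pred, target, lam=0.1):
    """Per-frame segmentation loss plus a temporal shape-consistency term.

    pred, target: (T, 1, H, W) predicted / ground-truth contour maps for a
    T-frame ultrasound sequence. The second term penalizes disagreement
    between contours predicted for adjacent frames.
    """
    seg = F.binary_cross_entropy(pred, target)           # frame-specific term
    consistency = (pred[1:] - pred[:-1]).abs().mean()    # adjacent frames should agree
    return seg + lam * consistency

pred = torch.rand(8, 1, 64, 64, requires_grad=True)      # stand-in U-Net outputs
target = (torch.rand(8, 1, 64, 64) > 0.5).float()
contour_loss(pred, target).backward()
```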
On Loss Functions for Deep-Learning Based T60 Estimation
Yuying Li, Yuchen Liu, D. Williamson
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414442 (published 2021-06-06)
Abstract: Reverberation time, T60, directly influences the amount of reverberation in a signal, and estimating it directly may help with dereverberation. Traditionally, T60 estimation has relied on signal processing or probabilistic approaches; deep-learning approaches have been developed only recently, and the appropriate loss function for training such networks has not been adequately determined. In this paper, we propose a composite classification- and regression-based cost function for training a deep neural network that predicts T60 for a variety of reverberant signals. We investigate pure-classification, pure-regression, and combined classification-regression loss functions, additionally incorporating computational measures of success. Our results reveal that the composite loss function leads to the best performance compared with the other loss functions and comparison approaches, and that it also helps with generalization.
Citations: 4
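A composite classification+regression loss of the kind described might look like the following PyTorch sketch, where the T60 range is binned for a cross-entropy branch and an MSE branch acts on the continuous estimate; the bin grid, the alpha weighting, and the two-headed output are our assumptions.

```python
import torch
import torch.nn.functional as F

BINS = torch.linspace(0.2, 1.8, steps=17)   # T60 grid in seconds (illustrative)

def composite_t60_loss(logits, t60_pred, t60_true, alpha=0.5):
    """Cross-entropy over T60 bins plus MSE on the continuous estimate.

    logits:   (B, len(BINS)) classification head
    t60_pred: (B,)           regression head
    """
    bin_idx = torch.bucketize(t60_true, BINS).clamp(max=len(BINS) - 1)
    cls = F.cross_entropy(logits, bin_idx)   # classification branch
    reg = F.mse_loss(t60_pred, t60_true)     # regression branch
    return alpha * cls + (1 - alpha) * reg

logits = torch.randn(4, len(BINS), requires_grad=True)
t60_pred = torch.rand(4, requires_grad=True) * 2.0
t60_true = torch.tensor([0.3, 0.6, 1.0, 1.5])
composite_t60_loss(logits, t60_pred, t60_true).backward()
```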
Improved Intra Mode Coding Beyond AV1
Yize Jin, Liang Zhao, Xin Zhao, Shangyi Liu, A. Bovik
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413420 (published 2021-06-06)
Abstract: In AOMedia Video 1 (AV1), directional intra prediction modes are applied to model local texture patterns that exhibit directionality. Each intra prediction direction is represented by a nominal mode index and a delta angle. The delta angle is entropy coded using a context shared between luma and chroma, with the context derived from the associated nominal mode. In this paper, two methods are proposed to further reduce the signaling cost of delta angles: cross-component delta angle coding and context-adaptive delta angle coding, which exploit the cross-component and spatial correlation of delta angles, respectively. The proposed methods were implemented on top of a recent version of libaom. Experimental results show that cross-component delta angle coding alone achieves an average 0.4% BD-rate reduction with a 4% encoding-time saving under all-intra configurations; combining both methods yields an average 1.2% BD-rate reduction.
Citations: 3
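The cross-component idea can be sketched as coding the chroma delta angle relative to the collocated luma delta angle, so the common "same delta" case maps to a cheap zero symbol. This toy Python version ignores AV1's actual entropy coder and context model; the symbol mapping is purely illustrative.

```python
# Toy model of cross-component delta angle coding. DELTAS lists the
# delta-angle steps a directional mode may take (AV1 uses multiples of
# 3 degrees); the entropy coding itself is omitted.
DELTAS = [-3, -2, -1, 0, 1, 2, 3]

def code_chroma_delta(luma_delta, chroma_delta):
    """Signal chroma relative to luma: 0 when the two deltas match."""
    return DELTAS.index(chroma_delta) - DELTAS.index(luma_delta)

def decode_chroma_delta(luma_delta, symbol):
    return DELTAS[DELTAS.index(luma_delta) + symbol]

assert code_chroma_delta(luma_delta=2, chroma_delta=2) == 0   # cheapest case
assert decode_chroma_delta(2, code_chroma_delta(2, 1)) == 1   # round-trips
```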
HSAN: A Hierarchical Self-Attention Network for Multi-Turn Dialogue Generation
Yawei Kong, Lu Zhang, Can Ma, Cong Cao
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413753 (published 2021-06-06)
Abstract: In a multi-turn dialogue system, response generation depends not only on the sentences in the context but also on the words within each utterance. Although many methods model words and utterances, problems remain, such as a tendency to generate generic responses. In this paper, we propose a hierarchical self-attention network, HSAN, which attends to the important words and utterances in the context simultaneously. First, a hierarchical encoder updates the word and utterance representations together with their respective position information. Second, the response representations are updated by the masked self-attention module in the decoder. Finally, the relevance between utterances and the response is computed by another self-attention module and used in the next response decoding step. In terms of both automatic metrics and human judgments, experimental results show that HSAN significantly outperforms all baselines on two public datasets.
Citations: 6
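A toy two-level encoder conveys the hierarchical self-attention structure: attend over words within each utterance, pool, then attend over utterances. The layer sizes, mean pooling, and module layout below are illustrative choices, not the paper's architecture.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Word-level self-attention within utterances, then utterance-level
    self-attention over the pooled results."""

    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.word_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.utt_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                          # x: (utterances, words, dim)
        w, _ = self.word_attn(x, x, x)             # attend over words
        u = w.mean(dim=1).unsqueeze(0)             # (1, utterances, dim) pooled
        c, _ = self.utt_attn(u, u, u)              # attend over utterances
        return c.squeeze(0)                        # (utterances, dim) context

ctx = torch.randn(5, 12, 32)                       # 5 utterances x 12 words
print(HierarchicalEncoder()(ctx).shape)            # torch.Size([5, 32])
```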
Seizure Detection Using Power Spectral Density via Hyperdimensional Computing
Lulu Ge, K. Parhi
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414083 (published 2021-06-06)
Abstract: Hyperdimensional (HD) computing holds promise for classifying two groups of data. This paper explores seizure detection from the electroencephalogram (EEG) of subjects with epilepsy using HD computing based on power spectral density (PSD) features. Publicly available intracranial EEG (iEEG) data collected from 4 dogs and 8 human patients in the Kaggle seizure detection contest are used. Two classification methods are explored: first, a few top-ranked PSD features from a small number of channels, selected by a prior classification, are used for HD classification; second, all PSD features extracted from all channels are used. For about half the subjects the small feature set outperforms the full set, and for the other half the reverse holds. HD classification achieves above 95% accuracy for six of the 12 subjects and between 85% and 95% accuracy for four subjects. For the remaining two subjects, the accuracy of HD computing falls short of classical approaches such as support vector machine classifiers.
Citations: 6
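The HD classification pipeline can be sketched as: encode each PSD feature vector into a high-dimensional bipolar hypervector, bundle the hypervectors of each class into a prototype, and classify by similarity to the prototypes. The random-projection encoding below is one common HD scheme and may differ from the authors'; feature and class counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_FEAT = 10_000, 24                  # hypervector dim; PSD features per window
PROJ = rng.standard_normal((D, N_FEAT))

def encode(psd):
    """Bipolar hypervector via random projection + sign."""
    return np.sign(PROJ @ psd)

def train_prototypes(feats, labels):
    """Bundle (sum, then sign) each class's hypervectors into a prototype."""
    return {c: np.sign(sum(encode(f) for f in feats[labels == c]))
            for c in np.unique(labels)}

def classify(psd, protos):
    hv = encode(psd)
    return max(protos, key=lambda c: hv @ protos[c])   # nearest prototype

feats = rng.random((20, N_FEAT))                   # stand-in PSD features
labels = np.array([0, 1] * 10)                     # 0 = interictal, 1 = seizure
protos = train_prototypes(feats, labels)
print(classify(feats[0], protos))
```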
An Adaptive Non-Linear Process for Under-Determined Virtual Microphone Beamforming
M. Bekrani, Anh H. T. Nguyen, Andy W. H. Khong
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413813 (published 2021-06-06)
Abstract: Virtual microphone beamforming techniques are attractive for devices limited by space constraints; they synthesize virtual microphone signals via interpolation algorithms. We extend existing virtual microphone signal interpolation by employing an adaptive non-linear (ANL) process for acoustic beamforming. The proposed ANL-based interpolation uses a target-presence probability criterion to determine the degree of non-linearity. The beamformer output is then derived by combining interpolations over target-inactive and target-active zones, a combination that trades off interference reduction against target-signal distortion. We apply the proposed ANL-based interpolator to the maximum signal-to-noise ratio (MSNR) beamformer and compare its performance against conventional beamforming and virtual microphone based beamforming methods in under-determined situations.
Citations: 1
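Virtual microphone interpolation with a tunable degree of non-linearity might look like the sketch below, which interpolates STFT magnitudes in a |.|**beta domain and phases linearly. In an adaptive scheme beta would be driven per time-frequency bin by the target-presence probability; here it is a fixed illustrative scalar, and the whole function is an assumption about the general technique rather than the authors' exact interpolator.

```python
import numpy as np

def virtual_mic(X1, X2, alpha=0.5, beta=1.0):
    """Synthesize a virtual microphone between two STFT channels X1, X2.

    Magnitudes are interpolated in the |.|**beta domain (beta = 1 is
    linear; smaller beta is more non-linear), phases linearly.
    """
    mag = ((1 - alpha) * np.abs(X1) ** beta + alpha * np.abs(X2) ** beta) ** (1 / beta)
    phase = (1 - alpha) * np.angle(X1) + alpha * np.angle(X2)
    return mag * np.exp(1j * phase)

rng = np.random.default_rng(0)
X1 = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
X2 = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
V = virtual_mic(X1, X2, beta=0.5)      # midpoint virtual channel
```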
Dependence-Guided Multi-View Clustering
Xia Dong, Danyang Wu, F. Nie, Rong Wang, Xuelong Li
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414971 (published 2021-06-06)
Abstract: In this paper, we propose a novel approach called dependence-guided multi-view clustering (DGMC). Our model strengthens the dependence between unified embedding learning and clustering, and promotes the dependence between the unified embedding and the embedding of each view. Specifically, DGMC learns a unified embedding and partitions the data jointly, so clustering results are obtained directly. A kernel dependence measure learns the unified embedding by forcing it to be close to the different views, thereby capturing the complex dependence among them. Moreover, an implicit-weight learning mechanism ensures the diversity of the views. An efficient algorithm with rigorous convergence analysis is derived to solve the proposed model. Experimental results on real-world datasets demonstrate the advantages of the proposed method over state-of-the-art alternatives.
Citations: 3
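The kernel dependence measure is plausibly an HSIC-style criterion; as a reference point, here is the standard empirical Hilbert-Schmidt Independence Criterion between two paired sample sets with Gaussian kernels (the kernel choice and bandwidth are ours, and the paper's measure may differ).

```python
import numpy as np

def hsic(X, Y, sigma=1.0):
    """Empirical HSIC between N paired samples X (N, dx) and Y (N, dy),
    with Gaussian kernels: HSIC = tr(K H L H) / (N - 1)**2."""
    n = X.shape[0]
    def gram(Z):
        d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    return np.trace(gram(X) @ H @ gram(Y) @ H) / (n - 1) ** 2

X = np.random.rand(10, 3)
print(hsic(X, X), hsic(X, np.random.rand(10, 3)))    # self >> independent
```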
Linear Multichannel Blind Source Separation based on Time-Frequency Mask Obtained by Harmonic/Percussive Sound Separation
Soichiro Oyabu, Daichi Kitamura, K. Yatabe
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413494 (published 2021-06-06)
Abstract: Determined blind source separation (BSS) extracts source signals by linear multichannel filtering. Its performance depends on the accuracy of source modeling, and existing BSS methods have accordingly proposed several source models. Recently, a determined BSS algorithm incorporating a time-frequency mask was proposed; it enables very flexible source modeling because the model is defined implicitly by a mask-generating function. Building on this framework, we propose a unification of determined BSS and harmonic/percussive sound separation (HPSS), an important preprocessing step for musical applications. By incorporating HPSS, both harmonic and percussive instruments can be accurately modeled within determined BSS. The resulting algorithm estimates the demixing filter using information obtained from an HPSS method. We also propose a stabilization method that is essential for the proposed algorithm. Our experiments show that the proposed method outperforms both HPSS and determined BSS methods, including independent low-rank matrix analysis.
Citations: 4
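Time-frequency masks of the kind such a mask-informed BSS method consumes can be obtained with classic median-filtering HPSS, sketched below on a magnitude spectrogram; the kernel size and soft Wiener-style mask are illustrative choices, not necessarily the HPSS variant used in the paper.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(S, k=17):
    """Median-filtering HPSS on a magnitude spectrogram S (freq x time).

    Harmonic content is smooth along time, percussive content along
    frequency; the ratio of the two filtered versions gives soft masks.
    """
    H = median_filter(S, size=(1, k))    # horizontal (harmonic) ridges
    P = median_filter(S, size=(k, 1))    # vertical (percussive) ridges
    mask_h = H ** 2 / (H ** 2 + P ** 2 + 1e-12)
    return mask_h, 1.0 - mask_h

S = np.abs(np.random.randn(128, 256))    # stand-in spectrogram
mask_h, mask_p = hpss_masks(S)
```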