ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Full-Duplex Multifunction Transceiver with Joint Constant Envelope Transmission and Wideband Reception
Jaakko Marin, Micael Bernhardt, T. Riihonen
DOI: 10.1109/ICASSP39728.2021.9413725
Abstract: This paper introduces and justifies a novel system concept that consists of full-duplex transceivers and uses a multifunction signal for simultaneous two-way communication, jamming, and sensing tasks. The proposed device structure and waveform enable simple-yet-effective interference suppression at the cost of being limited to constant-envelope transmission. This is a weakness only for the communication functionality, which becomes limited to frequency-shift keying (FSK), while frequency-modulated continuous-wave (FMCW) waveforms are effective for jamming and sensing purposes. We show how the transmission and reception, as well as different interference and distortion compensation procedures, are implemented in such multifunction transceivers. The system could also be applied to simultaneous spectrum monitoring alongside the above functions. Finally, we showcase the expected performance of such a system through numerical results.
Citations: 4
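The constant-envelope property that the abstract trades communication flexibility for is easy to see in a waveform sketch. Below is a minimal, illustrative construction (not the authors' exact design) of an FMCW chirp whose instantaneous frequency also carries FSK data; all parameter values (fs, T, B, f_dev) are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch, not the paper's exact waveform: a constant-envelope
# FMCW chirp whose instantaneous frequency is offset by an FSK symbol stream,
# so one transmission carries sensing/jamming (chirp) and data (FSK).
fs = 10e6          # sample rate [Hz] (assumed)
T = 1e-3           # chirp duration [s] (assumed)
B = 2e6            # sweep bandwidth [Hz] (assumed)
f_dev = 50e3       # FSK frequency deviation [Hz] (assumed)
bits_per_chirp = 10

t = np.arange(int(fs * T)) / fs
bits = np.random.randint(0, 2, bits_per_chirp)
# Hold each bit for an equal share of the chirp; map {0,1} -> {-f_dev,+f_dev}.
fsk_offset = (2 * bits[np.floor(bits_per_chirp * t / T).astype(int)] - 1) * f_dev

# Instantaneous frequency: linear sweep plus the data-dependent offset.
inst_freq = (B / T) * t + fsk_offset
phase = 2 * np.pi * np.cumsum(inst_freq) / fs
s = np.exp(1j * phase)            # |s| == 1: constant envelope by construction

assert np.allclose(np.abs(s), 1.0)
```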
UTDN: An Unsupervised Two-Stream Dirichlet-Net for Hyperspectral Unmixing
Qiwen Jin, Yong Ma, Xiaoguang Mei, Hao Li, Jiayi Ma
DOI: 10.1109/ICASSP39728.2021.9414810
Abstract: Recently, learning-based methods have received much attention in unsupervised hyperspectral unmixing, yet their ability to extract physically meaningful endmembers remains limited and their performance has not been satisfactory. In this paper, we propose a novel two-stream Dirichlet-net, termed UTDN, to address these problems. The weight-sharing architecture makes it possible to transfer the intrinsic properties of the endmembers during the unmixing process, which helps steer the network toward a more accurate and interpretable unmixing solution. Besides, the stick-breaking process is adopted to encourage the latent representation to follow a Dirichlet distribution, so that the physical properties of the estimated abundances can be naturally incorporated. Extensive experiments on both synthetic and real hyperspectral data demonstrate that the proposed UTDN outperforms other state-of-the-art approaches.
Citations: 1
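The stick-breaking construction mentioned in the abstract maps unconstrained encoder outputs onto the probability simplex, which is what makes the latent abundances nonnegative and sum to one. A minimal sketch of that transform follows (generic stick-breaking, not UTDN's specific encoder; shapes are illustrative).

```python
import torch

def stick_breaking(v: torch.Tensor) -> torch.Tensor:
    """Map fractions v in (0,1), shape (batch, K-1), to abundance vectors on
    the (K-1)-simplex: pi_k = v_k * prod_{j<k} (1 - v_j), remainder to pi_K.
    A generic sketch of the stick-breaking process the abstract mentions;
    UTDN's actual encoder parameterization is not specified here."""
    one = torch.ones(v.shape[0], 1, dtype=v.dtype)
    # Cumulative product of the "remaining stick" lengths, shifted right.
    remaining = torch.cumprod(1.0 - v, dim=1)
    pi = torch.cat([v, one], dim=1) * torch.cat([one, remaining], dim=1)
    return pi  # rows sum to 1, all entries nonnegative

v = torch.sigmoid(torch.randn(4, 5))   # e.g. encoder outputs for K=6 endmembers
abundances = stick_breaking(v)
assert torch.allclose(abundances.sum(dim=1), torch.ones(4))
```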
Detecting Alzheimer’s Disease from Speech Using Neural Networks with Bottleneck Features and Data Augmentation
Zhaoci Liu, Zhiqiang Guo, Zhenhua Ling, Yunxia Li
DOI: 10.1109/ICASSP39728.2021.9413566
Abstract: This paper presents a method for detecting Alzheimer’s disease (AD) from the spontaneous speech of subjects in a picture description task using neural networks. The method does not rely on manual transcriptions or annotations of a subject’s speech, but utilizes bottleneck features extracted from audio using an ASR model. The neural network contains convolutional neural network (CNN) layers for local context modeling, bidirectional long short-term memory (BiLSTM) layers for global context modeling, and an attention pooling layer for classification. Furthermore, a masking-based data augmentation method is designed to deal with the data scarcity problem. Experiments on the DementiaBank dataset show that the detection accuracy of the proposed method is 82.59%, better than a baseline based on manually designed acoustic features and support vector machines (SVM), and state-of-the-art for detecting AD from audio data alone on this dataset.
Citations: 11
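The abstract does not spell out the masking-based augmentation; a common realization is SpecAugment-style masking applied directly to the bottleneck-feature matrix, sketched below under that assumption (the function name, mask counts, and mask sizes are all illustrative).

```python
import numpy as np

def mask_augment(feats, num_time_masks=2, max_time=20,
                 num_feat_masks=2, max_feat=8, rng=None):
    """SpecAugment-style masking on a (frames, dims) bottleneck-feature matrix.
    The paper's exact masking scheme is not specified; this is one common choice."""
    rng = rng or np.random.default_rng()
    out = feats.copy()
    T, F = out.shape
    for _ in range(num_time_masks):              # zero out random frame spans
        w = rng.integers(1, max_time + 1)
        t0 = rng.integers(0, max(T - w, 1))
        out[t0:t0 + w, :] = 0.0
    for _ in range(num_feat_masks):              # zero out random feature bands
        w = rng.integers(1, max_feat + 1)
        f0 = rng.integers(0, max(F - w, 1))
        out[:, f0:f0 + w] = 0.0
    return out

augmented = mask_augment(np.random.randn(300, 40))  # e.g. 300 frames, 40 dims
```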
Decomposing Textures using Exponential Analysis
Yuan Hou, A. Cuyt, Wen-shin Lee, Deepayan Bhowmik
DOI: 10.1109/ICASSP39728.2021.9413909
Abstract: Decomposition is integral to most image processing algorithms and is often required in texture analysis. We present a new approach using a recent 2-dimensional exponential analysis technique. Exponential analysis offers the advantage of sparsity in the model and continuity in the parameters, resulting in a much more compact representation of textures compared to traditional Fourier or wavelet transform techniques. Our experiments include synthetic as well as real texture images from standard benchmark datasets. The results outperform the FFT in representing texture patterns with significantly fewer terms while retaining comparable RMSE values after reconstruction. The underlying periodic complex exponential model works best for texture patterns that are homogeneous. We demonstrate the usefulness of the method in two common vision processing applications, namely texture classification and defect detection.
Citations: 1
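Exponential analysis models a signal as a short sum of complex exponentials, f[n] ≈ Σ_j a_j z_j^n, which is why it can be far sparser than an FFT on quasi-periodic textures. A minimal 1D Prony-style sketch follows; the paper's 2-dimensional technique is more involved, so this only illustrates the underlying model.

```python
import numpy as np

def prony(f, M):
    """Recover a_j, z_j with f[n] ~= sum_j a_j * z_j**n from the samples f.
    Classic 1D Prony's method; the paper uses a 2-D exponential analysis
    variant, which this sketch only hints at."""
    N = len(f)
    # Linear prediction: f[n] = sum_{k=1..M} c_k f[n-k], solved least-squares.
    A = np.column_stack([f[M - k:N - k] for k in range(1, M + 1)])
    c = np.linalg.lstsq(A, f[M:N], rcond=None)[0]
    z = np.roots(np.concatenate(([1.0], -c)))          # exponential bases
    V = np.vander(z, N, increasing=True).T             # Vandermonde system
    a = np.linalg.lstsq(V, f, rcond=None)[0]           # amplitudes
    return a, z

# Two-term example: the recovered sparse model matches the samples.
n = np.arange(40)
f = 2.0 * 0.9 ** n + 1.0 * np.exp(1j * 0.5 * n)
a, z = prony(f, 2)
approx = np.vander(z, len(n), increasing=True).T @ a
assert np.allclose(approx, f, atol=1e-6)
```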
Periodic Signal Denoising: An Analysis-Synthesis Framework Based on Ramanujan Filter Banks and Dictionaries
Pranav Kulkarni, P. Vaidyanathan
DOI: 10.1109/ICASSP39728.2021.9413689
Abstract: Ramanujan filter banks (RFB) have in the past been used to identify periodicities in data. These are analysis filter banks with no synthesis counterpart for perfect reconstruction of the original signal, so they have not been useful for denoising periodic signals. This paper proposes a hybrid analysis-synthesis framework for denoising discrete-time periodic signals. The synthesis occurs via a pruned dictionary designed based on the output energies of the RFB analysis filters. A unique property of the framework is that the denoised output signal is guaranteed to be periodic, unlike with any of the other methods. For a large range of input noise levels, the proposed approach achieves a stable and high SNR gain, outperforming many traditional denoising techniques.
Citations: 3
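The RFB analysis filters are built from Ramanujan sums c_q(n). The sketch below computes c_q and compares output energies of a small bank to flag the dominant period; it is a toy version of the analysis side only (the pruned-dictionary synthesis step of the paper is not shown), with a simple matched-filter normalization chosen here for illustration.

```python
import numpy as np
from math import gcd

def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n) = sum over k coprime to q of cos(2*pi*k*n/q).
    These sequences are the building blocks of Ramanujan filter banks."""
    k = np.array([k for k in range(1, q + 1) if gcd(k, q) == 1])
    return np.cos(2 * np.pi * np.outer(n, k) / q).sum(axis=1)

# Minimal analysis step (a sketch, not the paper's full RFB): convolve the
# signal with two periods of each c_q and compare normalized output energies.
n = np.arange(200)
x = np.sin(2 * np.pi * n / 7) + 0.3 * np.random.randn(200)   # period-7 signal
energies = {}
for q in range(2, 11):
    h = ramanujan_sum(q, np.arange(2 * q))       # filter: two periods of c_q
    y = np.convolve(x, h, mode="valid")
    energies[q] = float(np.sum(y ** 2) / np.sum(h ** 2))
print(max(energies, key=energies.get))           # expected to peak at q = 7
```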
A Large-Scale Chinese Long-Text Extractive Summarization Corpus
Kai Chen, Guanyu Fu, Qingcai Chen, Baotian Hu
DOI: 10.1109/ICASSP39728.2021.9414946
Abstract: Recently, large-scale datasets have vastly facilitated development in nearly all domains of Natural Language Processing. However, the lack of a large-scale Chinese corpus is still a critical bottleneck for further research on deep text summarization methods. In this paper, we publish a large-scale Chinese Long-text Extractive Summarization corpus named CLES. The corpus contains about 104K pairs, originally collected from Sina Weibo. To verify the quality of the corpus, we also manually tagged the relevance scores of 5,000 pairs. Our benchmark models on the proposed corpus include conventional deep-learning-based extractive models and several pre-trained BERT-based algorithms. Their performances are reported and briefly analyzed to facilitate further research on the corpus, which we will release publicly.
Citations: 3
Drawing Order Recovery from Trajectory Components
Minghao Yang, Xukang Zhou, Yangchang Sun, Jinglong Chen, Baohua Qiang
DOI: 10.1109/ICASSP39728.2021.9413542
Abstract: Although widely discussed, drawing order recovery (DOR) from static images remains a challenging task. Based on the idea that drawing trajectories can be recovered by connecting their trajectory components in the correct order, this work proposes a novel DOR method for static images. The method contains two steps. First, we adopt a convolutional neural network (CNN) to predict the next possible drawing components, which converts the components in an image into reasonable sequences; we denote this architecture Im2Seq-CNN. Second, since the reasonable sequences generated by the first step may contain errors, we construct a sequence-to-order structure (Seq2Order) to adjust the sequences into the correct orders. The main contributions are: (1) the Im2Seq-CNN step performs DOR on components instead of tracing trajectories pixel by pixel, converting static images into component sequences; (2) the Seq2Order step adopts image position codes instead of traditional point coordinates in its encoder-decoder gated recurrent neural network (GRU-RNN). The proposed method is evaluated on two well-known open handwriting databases and yields robust and competitive results on handwriting DOR tasks compared to the state of the art.
Citations: 1
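As a rough picture of the Seq2Order step, the sketch below wires up a toy encoder-decoder GRU over discrete "position codes"; the layer sizes, code vocabulary, and teacher-forced training setup are all assumptions for illustration, not the authors' configuration.

```python
import torch
import torch.nn as nn

class Seq2Order(nn.Module):
    """Toy encoder-decoder GRU in the spirit of the Seq2Order step: it reads a
    sequence of component codes (learned embeddings of grid positions, standing
    in for the paper's "image position codes") and emits them in a predicted
    drawing order. Dimensions and vocabulary are illustrative assumptions."""
    def __init__(self, n_positions=64, d=128):
        super().__init__()
        self.embed = nn.Embedding(n_positions, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.decoder = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, n_positions)

    def forward(self, src_codes, tgt_codes):
        # src_codes: (B, S) position codes in arbitrary (CNN-proposed) order;
        # tgt_codes: (B, S) teacher-forced decoder inputs.
        _, h = self.encoder(self.embed(src_codes))
        dec_out, _ = self.decoder(self.embed(tgt_codes), h)
        return self.out(dec_out)     # (B, S, n_positions) next-code logits

model = Seq2Order()
src = torch.randint(0, 64, (2, 10))
logits = model(src, src)             # shape check only; training loop omitted
loss = nn.functional.cross_entropy(logits.reshape(-1, 64), src.reshape(-1))
```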
A Structure-Guided and Sparse-Representation-Based 3d Seismic Inversion Method
B. She, Yaojun Wang, Guang Hu
DOI: 10.1109/ICASSP39728.2021.9415071
Abstract: Existing seismic inversion methods are usually 1D, mainly focusing on improving the vertical resolution of inversion results. The few 2D or 3D inversion techniques are either too simple, lacking consideration of stratigraphic structures, or too complicated, requiring dip information to be extracted and a complex constrained optimization problem to be solved. In this work, with the help of the gradient structure tensor (GST) and dictionary learning and sparse representation (DLSR), we propose a 3D inversion approach (GST-DLSR) that considers both vertical and horizontal structural constraints. In the vertical direction, we learn the structural features of subsurface models from well-log data by DLSR. In the horizontal direction, we obtain stratigraphic structural features from a 3D seismic image by GST. We then apply the acquired structural features to constrain the entire inversion procedure. Experiments show that GST-DLSR takes advantage of both techniques, producing inversion results with high resolution, good lateral continuity, and enhanced structural features.
Citations: 0
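The gradient structure tensor the horizontal constraint relies on is standard: smooth the outer products of image gradients, then read orientation and coherence off the 2x2 eigenstructure. A sketch under those standard definitions follows; how GST-DLSR injects the result into the inversion is not shown here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor_orientation(img, sigma_g=1.0, sigma_t=3.0):
    """Estimate local stratigraphic orientation from a 2-D seismic slice via
    the gradient structure tensor (GST). Standard GST construction; sigma
    values are illustrative assumptions."""
    # Gaussian-derivative gradients, then smoothed outer products.
    gx = gaussian_filter(img, sigma_g, order=(0, 1))
    gy = gaussian_filter(img, sigma_g, order=(1, 0))
    Jxx = gaussian_filter(gx * gx, sigma_t)
    Jxy = gaussian_filter(gx * gy, sigma_t)
    Jyy = gaussian_filter(gy * gy, sigma_t)
    # Orientation of the dominant eigenvector of [[Jxx, Jxy], [Jxy, Jyy]].
    theta = 0.5 * np.arctan2(2.0 * Jxy, Jxx - Jyy)
    # Coherence in [0, 1]: close to 1 where layering is strongly oriented.
    lam = np.sqrt((Jxx - Jyy) ** 2 + 4.0 * Jxy ** 2)
    coherence = lam / (Jxx + Jyy + 1e-12)
    return theta, coherence

theta, coh = structure_tensor_orientation(np.random.randn(128, 128))
```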
Evolving Quantized Neural Networks for Image Classification Using A Multi-Objective Genetic Algorithm
Yong Wang, Xiaojing Wang, Xiaoyu He
DOI: 10.1109/ICASSP39728.2021.9413519
Abstract: Recently, many model quantization approaches have been investigated to reduce model size and improve the inference speed of convolutional neural networks (CNNs). However, these approaches usually lead to a decrease in classification accuracy. To address this problem, this paper proposes a mixed-precision quantization method combined with channel expansion of CNNs using a multi-objective genetic algorithm, called MOGAQNN. In MOGAQNN, each individual in the population encodes a mixed-precision quantization policy and a channel expansion policy. During the evolution process, the two policies are optimized simultaneously by the non-dominated sorting genetic algorithm II (NSGA-II). Finally, we choose the best individual in the last population and evaluate it on the test set to obtain the final performance. Experimental results with five popular CNNs on two benchmark datasets demonstrate that MOGAQNN can greatly reduce model size and improve classification accuracy at the same time.
Citations: 0
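As a concrete picture of what one evolved individual might encode, the sketch below pairs a symmetric uniform quantizer with a per-layer genome of bit-widths and expansion ratios. The quantizer form, gene alphabets, and layer count are assumptions for illustration, not MOGAQNN's specification.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits:
    the basic per-layer operation a mixed-precision policy selects.
    MOGAQNN's exact quantizer and search operators are not specified here."""
    qmax = 2 ** (bits - 1) - 1
    w_abs = np.max(np.abs(w))
    scale = w_abs / qmax if w_abs > 0 else 1.0
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

# One candidate "individual": per-layer bit-widths plus channel-expansion
# ratios, the two policies the genetic algorithm evolves jointly.
n_layers = 5
individual = {
    "bits": np.random.choice([2, 4, 8], size=n_layers),
    "expand": np.random.choice([1.0, 1.25, 1.5], size=n_layers),
}
layer_w = np.random.randn(64, 3, 3, 3)                    # e.g. a conv kernel
w_q = quantize_uniform(layer_w, int(individual["bits"][0]))
# NSGA-II would then rank individuals on (model size, accuracy) and evolve.
```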
An Investigation of Using Hybrid Modeling Units for Improving End-to-End Speech Recognition System
Shunfei Chen, Xinhui Hu, Sheng Li, Xinkang Xu
DOI: 10.1109/ICASSP39728.2021.9414598
Abstract: The acoustic modeling unit is crucial for an end-to-end speech recognition system, especially for Mandarin. Until now, most studies on Mandarin speech recognition have focused on individual units, and few have paid attention to combinations of these units. This paper uses a hybrid of syllable, Chinese character, and subword units for an end-to-end speech recognition system based on CTC/attention multi-task learning. In this approach, the character-subword unit is used to train the Transformer model in the main task, while the syllable unit enhances the Transformer's shared encoder in the auxiliary task through the Connectionist Temporal Classification (CTC) loss function. Recognition experiments were conducted on AISHELL-1 and on an open 1200-hour Mandarin speech corpus collected from OpenSLR. The results demonstrate that the syllable-char-subword hybrid modeling unit achieves better performance than the conventional char-subword units, with a 6.6% relative CER reduction on the 1200-hour data. The substitution error rate is also considerably reduced.
Citations: 7
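The joint objective described here is the usual hybrid CTC/attention weighting, with the twist that the two branches see different unit inventories. A shape-level sketch under assumed dimensions and an assumed weight lambda:

```python
import torch
import torch.nn as nn

# Sketch of the hybrid-unit multi-task objective: a CTC loss over syllable
# targets regularizes the shared encoder, while the attention decoder is
# trained on char-subword targets. All shapes and lambda are assumptions.
B, T_enc, D = 4, 50, 256
n_syllables, n_subwords = 400, 5000
lam = 0.3                                    # CTC weight (assumed value)

enc_out = torch.randn(T_enc, B, D)           # shared encoder output (T, B, D)
ctc_head = nn.Linear(D, n_syllables + 1)     # +1 for the CTC blank symbol
log_probs = ctc_head(enc_out).log_softmax(-1)

syl_targets = torch.randint(1, n_syllables + 1, (B, 12))   # 0 reserved: blank
input_lens = torch.full((B,), T_enc, dtype=torch.long)
target_lens = torch.full((B,), 12, dtype=torch.long)
ctc = nn.CTCLoss(blank=0)(log_probs, syl_targets, input_lens, target_lens)

dec_logits = torch.randn(B, 20, n_subwords)  # stand-in for decoder outputs
subword_targets = torch.randint(0, n_subwords, (B, 20))
att = nn.functional.cross_entropy(
    dec_logits.reshape(-1, n_subwords), subword_targets.reshape(-1))

loss = lam * ctc + (1.0 - lam) * att         # joint CTC/attention objective
```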