"Improved Feed Forward Attention Mechanism in Bidirectional Recurrent Neural Networks for Robust Sequence Classification"
Sai Bharath Chandra Gutha, M. Shaik, Tejas Udayakumar, Ajit Ashok Saunshikhar
2020 International Conference on Signal Processing and Communications (SPCOM), July 2020. DOI: https://doi.org/10.1109/SPCOM50965.2020.9179606
Abstract: Feed Forward Attention (FFA) in Recurrent Neural Networks (RNNs) is a popular attention mechanism for classifying sequential data. In Bidirectional RNNs (BiRNNs), FFA concatenates the hidden states from the forward and backward layers to compute unscaled logits and normalized attention weights at each time step, and softmax is applied to the weighted sum of logits to compute posterior probabilities. Such concatenation corresponds to adding the individual unnormalized attention weights and unscaled logits from the forward and backward layers. In this paper, we present a novel attention mechanism, the Improved Feed Forward Attention Mechanism (IFFA), which computes the probabilities and normalized attention weights separately for the forward and backward layers without concatenating the hidden states. Weighted probabilities are then computed at each time step and averaged across time. Our experimental results show IFFA outperforming FFA on diverse classification tasks such as speech accent, emotion, and whisper classification.
{"title":"Clustering tendency assessment for datasets having inter-cluster density variations","authors":"Dheeraj Kumar, J. Bezdek","doi":"10.1109/SPCOM50965.2020.9179608","DOIUrl":"https://doi.org/10.1109/SPCOM50965.2020.9179608","url":null,"abstract":"Clustering tendency assessment, i.e., determining if a dataset has any inherent clusters, and if so, how many clusters, k to seek is a crucial pre-clustering task. The visual assessment of tendency (VAT) and improved visual assessment of tendency (iVAT) algorithms provide a visual way to assess cluster tendency of a dataset by reordering the pairwise dissimilarity matrix so that potential clusters are displayed as dark blocks along the diagonal in the image of the reordered dissimilarity matrix. VAT and iVAT, being distance-based schemes, fail to perform well for datasets consisting of clusters characterized by different density levels. In this paper, we introduce two new members of the VAT family of algorithms: Locally Scaled VAT (LSVAT) and Locally Scaled iVAT (LS-iVAT), which produces better iVAT images for data having inter-cluster density variations. Numerical experiments comparing the proposed novel approach with baseline VAT/iVAT as well as spectral clustering and density-based clustering algorithms establish that LS-VAT and LS-iVAT are superior to the comparable algorithms in terms of clustering quality.","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114672661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"End-to-end audio-scene classification from raw audio: Multi time-frequency resolution CNN architecture for efficient representation learning"
T. V. Kumar, R. Sundar, Tilak Purohit, V. Ramasubramanian
2020 International Conference on Signal Processing and Communications (SPCOM), July 2020. DOI: https://doi.org/10.1109/SPCOM50965.2020.9179600
Abstract: We propose and study a novel multi-temporal CNN architecture for end-to-end 'audio-scene classification' (ASC) from the raw audio signal. Conventional CNNs use a fixed-size kernel (whether for image or 1-d signal classification), which corresponds to applying a filter bank in which each filter has a fixed time-frequency resolution (i.e., a fixed-duration impulse response and a fixed-bandwidth frequency response), importantly with a specific time-frequency trade-off. In contrast, to allow for multiple time-frequency resolutions, we use a multi-temporal CNN architecture with multiple kernel branches (up to 12), each of a different length, thereby realizing multiple filter banks with different time-frequency resolutions that process the input raw audio signal and create feature maps (e.g., ranging from very narrow-band to very wide-band spectrographic maps in steps of fine time-frequency resolution) corresponding to different time-frequency trade-offs. Applying this architecture to end-to-end audio-scene classification is shown to offer consistent and significant performance gains (e.g., 11-15% absolute accuracy for the multi-temporal case of 12 branches) over the conventional single-temporal CNN, and also to outperform state-of-the-art results for this task.
{"title":"Classification of Social Signals Using Deep LSTM-based Recurrent Neural Networks","authors":"Himanshu Joshi, Ananya Verma, Amrita Mishra","doi":"10.1109/SPCOM50965.2020.9179516","DOIUrl":"https://doi.org/10.1109/SPCOM50965.2020.9179516","url":null,"abstract":"Non-linguistic speech cues aid expression of various emotions in human communication. In this work, we demonstrate the application of deep long short-term memory (LSTM) recurrent neural networks for frame-wise detection and classification of laughter and filler vocalizations in speech data. Further, we propose a novel approach to perform classification by incorporating cluster information as an additional feature wherein the clusters in the dataset are extracted via a k-means clustering algorithm. Extensive simulation results demonstrate that the proposed approach achieves significant improvement over the conventional LSTM-based classification methods. Also, the performance of deep LSTM models obtained by stacking LSTMs, is studied. Lastly, for classification of the temporally correlated speech data considered in this work, a comparison with popular machine learning-based techniques validates the superiority of the proposed LSTM-based scheme.","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114594738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying Cultural Music using Melodic Features","authors":"Amruta Vidwans, Prateek Verma, P. Rao","doi":"10.1109/SPCOM50965.2020.9179597","DOIUrl":"https://doi.org/10.1109/SPCOM50965.2020.9179597","url":null,"abstract":"We present melody based classification of musical styles by exploiting pitch and energy based characteristics computed on the audio signal. Three prominent musical styles were chosen which have improvisation as an integral part with similar melodic principles, theme, and structure of concerts namely, Hindustani, Carnatic and Turkish music. Listeners of one or more of these genres can discriminate these entirely based on the melodic style. The resynthesized melody of music pieces that share the underlying raga/makam, removing any singer cues, was used to validate our hypothesis that style distinction is embedded in the melody. Our automatic method is based on finding a set of highly discriminatory features, motivated by musicological knowledge, to capture distinct characteristics of the melodic contour. The nature of transitions in the pitch contour, presence of microtonal notes and the dynamic variations in the vocal energy are exploited. The automatically classified style labels are found to correlate well with the judgments of human listeners. The melody based features when combined with timbre based features, were found to improve the classification performance on the music metadata based genre labels.","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114638986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Low Complexity Detector with Near-ML Performance for Generalized Differential Spatial Modulation","authors":"Deepak Jose, S. Sameer","doi":"10.1109/SPCOM50965.2020.9179552","DOIUrl":"https://doi.org/10.1109/SPCOM50965.2020.9179552","url":null,"abstract":"Detection of most differential modulation schemes involves the past and present received symbols only. Differential spatial modulation (DSM) is a scheme where an extra degree of freedom for transmitting information is available in the form of transmit antennas. A variant of this scheme called generalized differential scheme for spatial modulation (GD-SM) has a power allocation strategy to improve the error performance as compared to DSM. But the maximum likelihood (ML) detector for this scheme is computationally intensive for higher order modulation. We propose a low complexity detection strategy that makes use of the correlation between the channel coefficients during successive time slots to detect the activated antenna followed by the decoding of the M-ary phase shift keying (MPSK) symbol transmitted through that antenna. The proposed detector achieves a tremendous reduction in complexity close to 83% compared to the ML detector, but with a negligible penalty in error performance.","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122317618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-target hybrid CTC-Attentional Decoder for joint phoneme-grapheme recognition","authors":"Shreekantha Nadig, V. Ramasubramanian, Sachit Rao","doi":"10.1109/SPCOM50965.2020.9179603","DOIUrl":"https://doi.org/10.1109/SPCOM50965.2020.9179603","url":null,"abstract":"In traditional Automatic Speech Recognition (ASR) systems, such as HMM-based architectures, words are predicted using either phonemes or graphemes as sub-word units. In this paper, we explore such joint phoneme-grapheme decoding using an Encoder-Decoder network with hybrid Connectionist Temporal Classification (CTC) and Attention mechanism. The Encoder network is shared between two Attentional Decoders which individually learn to predict phonemes and graphemes from a unique Encoder representation. This Encoder and multi-decoder network is trained in a multi-task setting to minimize the prediction error for both phoneme and grapheme sequences. We also implement the phoneme decoder at an intermediate layer of Encoder and demonstrate performance benefits to such an architecture. By carrying out various experiments on different architectural choices, we demonstrate, using the TIMIT and Librispeech 100 hours datasets, that with this approach, an improvement in performance than the baseline independent phoneme and grapheme recognition systems can be achieved.","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"236 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115459723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Luminance Channel Based Camera Model Identification","authors":"Nayan Moni Baishya, P. Bora","doi":"10.1109/SPCOM50965.2020.9179564","DOIUrl":"https://doi.org/10.1109/SPCOM50965.2020.9179564","url":null,"abstract":"Camera model identification is an active research problem because of its importance in investigating the source and the authenticity of an image. Traditional camera model identification methods are based on strategies to extract the low-level traces left by the image acquisition pipeline of a camera on an image. One such intrinsic and camera-specific trace is the sensor pattern noise (SPN). The SPN is roughly approximated from the noise-residual obtained by performing high-pass filtering on an image. The noise-residual of an image also contains information about other types of noises. The extraction of the noise-residuals is generally performed on a single primary color channel, like the green channel of an image. However, the performance of a channel in the YCbCr color space is never explored. In this paper, we have proposed a novel camera model identification method based on convolutional neural network, where the noise-residuals are extracted from the luminance (Y) channel of the images. A constrained convolutional layer learns data-driven high-pass filters to extract the noise-residuals and the following layers learn a feature representation for the classification task. We have conducted experiments with multiple class combinations from the Dresden image database. The experimental results show the effectiveness of the Y channel for camera model identification both in terms of classification accuracy and convergence of the network.","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116144963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPCOM 2020 Contents","authors":"","doi":"10.1109/spcom50965.2020.9179531","DOIUrl":"https://doi.org/10.1109/spcom50965.2020.9179531","url":null,"abstract":"","PeriodicalId":208527,"journal":{"name":"2020 International Conference on Signal Processing and Communications (SPCOM)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115094572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Semi-supervised learning for acoustic model retraining: Handling speech data with noisy transcript"
Abhijith Madan, Ayush Khopkar, Shreekantha Nadig, M. SrinivasaRaghavanK., Dhanya Eledath, V. Ramasubramanian
2020 International Conference on Signal Processing and Communications (SPCOM), July 2020. DOI: https://doi.org/10.1109/SPCOM50965.2020.9179517
Abstract: We address the problem of retraining a seed acoustic model from a large corpus with noisy labeling. We propose an iterative selection of the corpus data, based on a forced-alignment likelihood and a fuzzy string-matching score, to retrain the acoustic model in order of increasing transcript noise, yielding a succession of enhanced acoustic models that offer progressively lower error rates on held-out test data. We report results in terms of phoneme error rate (PER) on a large broadcast-news corpus from a national broadcast network containing transcribed speech in multiple languages, demonstrating the strong utility of this approach for training acoustic models from noisy transcripts.