2019 National Conference on Communications (NCC): Latest Papers, Page 6

Modelling Glottal Flow Derivative Signal for Detection of Replay Speech Samples

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732249

Jagabandhu Mishra, D. Pati, S. Prasanna

Abstract: Automatic speaker verification systems are widely known to be vulnerable to replay speech. This work detects replay speech using the information available in the glottal flow derivative (GFD) signal. In signal processing terms, the speech signal can be represented as the response of a vocal-tract system excited by an excitation source in the form of glottal flow. Record and replay devices distort the spectral characteristics of the naturally uttered speech sample, which in turn distorts the corresponding GFD signal. In this work, the GFD signals are parameterized using standard mel filters, and Gaussian mixture models are built for detection. Although various estimation methods are available, correlation analysis shows that, in the context of this work, the dynamic programming phase slope algorithm (DYPSA) method is relatively more effective in estimating the GFD signals. The experiments are conducted on the ASVSpoof2017 database. The proposed glottal flow derivative mel frequency cepstral coefficients (GFDMFCC) feature provides a 20.53% equal error rate (EER). This performance is poorer than that of speech and residual based features, mainly because of the absence of fine structure information in the estimated GFD signal. However, in fusion with speech signal based constant-Q cepstral coefficients (CQCC) features, the GFDMFCC feature provides an improvement of 10.30% with reference to the conventional residual feature. This shows the usefulness of modelling GFD signals for detection of replay signals.

Pages: 1-5
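
The abstract above reports its result as an equal error rate (EER). As a rough illustration of that metric only, the following sketch approximates the EER from two lists of detector scores. The function name `equal_error_rate`, the convention that higher scores mean "genuine", and the example scores are assumptions made for this example; they are not taken from the paper.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the trial looks more like genuine speech.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: replayed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Hypothetical scores, chosen only to exercise the function.
    genuine = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(genuine, spoof))
```

With a finite set of trials the two error rates rarely cross exactly, so the sketch reports the midpoint at the closest threshold. Evaluation toolkits typically interpolate instead.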

Citations: 0

An Iterative Eigensolver for Rank-Constrained Semidefinite Programming

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732206

Rajat Sanyal, A. V. Singh, K. Chaudhury

Citations: 0

Modified Generalised Quadrature Spatial Modulation

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732234

Kiran Gunde, K. Hari

Citations: 5

Interpolated Compressed Sensing for Calibrationless Parallel MRI Reconstruction

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732192

S. Datta, B. Deka

Citations: 4

Truthful Double Auction Based VM Allocation for Revenue-Energy Trade-Off in Cloud Data Centers

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732201

Yashwant Singh Patel, Animesh Nighojkar, R. Misra

Abstract: With the advances in virtualization technologies, the cloud has emerged as a flexible and cost-effective service paradigm, provisioning on-demand VM resources to users via a pay-per-use business model. In cloud data centers, effective resource provisioning is required to minimize energy consumption and maximize the cloud provider's revenue. However, existing mechanisms have focused on optimizing either energy or the profit of the cloud service provider (CSP), while incurring inefficient resource allocation. To address these research challenges and balance the trade-off between energy and revenue, we propose a Vickrey-Clarke-Groves (VCG) based truthful double auction mechanism (TDAM). We first formulate a joint optimization problem and prove it NP-hard by reducing it to a multi-dimensional bin-packing problem. We then design TDAM, a truthful double auction scheme, and propose an efficient winning bid algorithm for VM allocation and a VCG based mechanism for calculating the payment of each bid. Being a double auction, TDAM allows both the buyers (VMs) and the sellers (PMs) to submit their bids and asks, respectively, and performs allocation based on energy consumption while upholding truthfulness, in order to avoid falsification of the submitted bid or ask values. Through theoretical analysis and extensive experiments, we show that TDAM makes a significant contribution while maintaining truthfulness, individual rationality, and economic efficiency, and that it has polynomial time complexity.

Pages: 1-6
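
To illustrate why VCG-style payments discourage misreporting, here is a minimal sketch of the simplest textbook case: a one-sided auction of identical items in which each bidder wants exactly one. This is not the TDAM double auction described in the abstract. The function name `vcg_unit_demand`, the bidder names, and the bid values are all hypothetical.

```python
def vcg_unit_demand(bids, num_items):
    """VCG outcome for identical items when every bidder wants one item.

    bids: dict mapping bidder name -> reported value for one item.
    Returns (winners, payments).
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winners = [name for name, _ in ranked[:num_items]]
    # Each winner pays the welfare loss it imposes on the others. In this
    # setting that is the highest losing bid, or 0 if nobody is displaced.
    clearing = ranked[num_items][1] if len(ranked) > num_items else 0.0
    payments = {name: clearing for name in winners}
    return winners, payments


if __name__ == "__main__":
    # Hypothetical VM bids competing for two identical slots.
    bids = {"vm1": 10, "vm2": 7, "vm3": 4, "vm4": 2}
    winners, payments = vcg_unit_demand(bids, num_items=2)
    print(winners, payments)
```

A winner's payment depends only on the other bidders' reports, so shading one's own bid cannot lower the price. It can only risk losing the item. This is the intuition behind truthfulness in this simplified setting.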

Citations: 3

Multimodal Fusion of Speech and Text using Semi-supervised LDA for Indexing Lecture Videos

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732253

M. Husain, S. Meena

Abstract: Lecture videos are the most popular learning materials due to their pedagogical benefits. However, accessing a topic or subtopic of interest requires manual examination of each frame of the video, which becomes more tedious as the volume and length of videos increase. The main problem thus becomes the efficient automatic segmentation and indexing of lecture videos to enable faster retrieval of specific and relevant content. In this paper, we present automatic indexing of lecture videos using topic hierarchies extracted from slide text and audio transcripts. Indexing videos based on slide text is more accurate due to higher character recognition rates, but the text content is very abstract and subjective. In contrast, audio transcripts provide comprehensive details about the topics, but retrieval results are imprecise due to a higher word error rate (WER). To address this problem, we propose fusing the complementary strengths of slide text and audio transcript information using a semi-supervised LDA algorithm. Further, we strive to improve learning of the model by using words recognized from video slides as seed words and training the model to learn the distribution of video transcriptions around these seed words. We test the performance of the proposed multimodal indexing scheme on 500 classroom videos from Coursera, NPTEL, and KLETU (KLE Technological University). The proposed multimodal fusion based scheme achieves an average improvement of 44.49% in F-Score compared with indexing using unimodal approaches.

Pages: 1-6
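
The abstract attributes the imprecision of transcript-based retrieval to a higher WER. As a small illustration of how that quantity is usually defined, the sketch below computes the word-level edit distance between a reference and a recognized transcript and divides it by the reference length. The function name `word_error_rate` and the example sentences are assumptions for this example, not material from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical lecture phrase with one misrecognized word.
    print(word_error_rate("the fast fourier transform",
                          "the vast fourier transform"))
```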

Citations: 9

A SegNet Based Image Enhancement Technique for Air-Tissue Boundary Segmentation in Real-Time Magnetic Resonance Imaging Video

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732257

Renuka Mannem, Valliappan Ca, P. Ghosh

Abstract: In this paper, we propose a new technique for segmentation of the Air-Tissue Boundaries (ATBs) in the upper airway of the vocal tract in the midsagittal plane of real-time Magnetic Resonance Imaging (rtMRI) videos. The proposed technique uses a segmentation using Fisher-discriminant measure (SFDM) scheme. The paper introduces an image enhancement technique that applies semantic segmentation when preprocessing the rtMRI frames before ATB prediction. We use a deep convolutional encoder-decoder architecture (SegNet) for semantic segmentation of the rtMRI images. The paper examines the significance of preprocessing before ATB prediction by implementing the SFDM approach with different preprocessing techniques. Experiments with 5779 rtMRI video frames from four subjects demonstrate that semantic segmentation based image enhancement of rtMRI frames improves the performance of the SFDM approach compared to the other preprocessing approaches. Experimental results also show that the proposed approach yields 8.6% less error in ATB prediction compared with a semi-supervised grid based baseline segmentation approach.

Pages: 1-6

Citations: 7

Efficient Methods for Estimating Sinusoidal Frequencies Using Line Spectral Pairs

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732199

P. Vishnu, C. S. Ramalingam

Abstract: The maximum likelihood (ML) method of estimating the frequencies of $p$ sinusoids in the presence of AWGN is computationally very costly because of the dimensionality of the error surface; its advantage is that it has the lowest threshold among all known practical estimators. We propose a low complexity method using Line Spectral Pairs (LSPs), where the LSPs are derived from an estimated $A(z)$ obtained using the Multiple Signal Classification (MUSIC) method. The proposed method evaluates the likelihood function at significantly fewer points, at most $^{5p}C_{p}$, to obtain the estimates. Furthermore, no iterative finer search is required. Nevertheless, the proposed method's threshold is comparable to that of ML when tested using the well-known two-sinusoids example; similar performance was observed in the case of three sinusoids. Further improvements were observed when the beamformer function was used for detecting and removing outliers. For the two-sinusoid case, outlier removal resulted in a threshold that was lower than that of ML by as much as 9 dB (3π/2 case). We also present results for a direction of arrival (DOA) estimation example that results in the same threshold as that of ML.

Pages: 1-6
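
To make the bound on likelihood evaluations concrete, the following sketch evaluates the binomial coefficient "5p choose p" for small numbers of sinusoids. Reading the bound as choosing p frequencies from a pool of 5p candidates is an assumption made here for illustration, and the helper name `max_likelihood_evaluations` is hypothetical.

```python
from math import comb  # available in Python 3.8+


def max_likelihood_evaluations(p):
    """Upper bound on likelihood evaluations for p sinusoids: C(5p, p)."""
    return comb(5 * p, p)


if __name__ == "__main__":
    for p in (1, 2, 3):
        # Prints the bound for one, two and three sinusoids.
        print(p, max_likelihood_evaluations(p))
```

For the two-sinusoid and three-sinusoid cases mentioned in the abstract, the bound is 45 and 455 evaluations respectively.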

Citations: 1

Analysis of Beam Wander Effect of Flat-topped Multi-Gaussian Beam for FSO Communication Link

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732189

Arka Mukherjee, Subrat Kar, V. Jain

Citations: 0

Camera Zoom Detection and Classification Based on Application of Histogram Intersection and Kullback Leibler Divergence

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI: 10.1109/NCC.2019.8732240

Pavan Sandula, M. Okade

Citations: 1