{"title":"Modelling Glottal Flow Derivative Signal for Detection of Replay Speech Samples","authors":"Jagabandhu Mishra, D. Pati, S. Prasanna","doi":"10.1109/NCC.2019.8732249","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732249","url":null,"abstract":"Automatic speaker verification systems are known to be vulnerable to replay speech. The present work deals with detecting replay speech by using the information available in the glottal flow derivative (GFD) signal. In signal processing terms, the speech signal can be represented as the response of a vocal-tract system excited by an excitation source in the form of glottal flow. Record and replay devices distort the spectral characteristics of the naturally uttered speech sample, resulting in distortion of the corresponding GFD signals. In this work, the GFD signals are parameterized using standard mel filters, and Gaussian mixture models are built for detection. Although various methods are available, correlation analysis shows that, in the context of the present work, the dynamic programming phase slope algorithm (DYPSA) method is relatively more effective in estimating the GFD signals. The experimental studies are carried out on the ASVSpoof2017 database. The proposed glottal flow derivative mel frequency cepstral coefficients (GFDMFCC) feature provides an equal error rate (EER) of 20.53%. This performance is poorer than that of speech- and residual-based features, mainly due to the absence of fine structure information in the estimated GFD signal. However, in fusion with speech-signal-based constant-Q cepstral coefficients (CQCC) features, the GFDMFCC feature provides an improvement of 10.30% with reference to the conventional residual feature. This shows the usefulness of modelling GFD signals for detection of replay signals.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"45 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76136711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Iterative Eigensolver for Rank-Constrained Semidefinite Programming","authors":"Rajat Sanyal, A. V. Singh, K. Chaudhury","doi":"10.1109/NCC.2019.8732206","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732206","url":null,"abstract":"Rank-constrained semidefinite programming (SDP) arises naturally in various applications such as max-cut, angular (phase) synchronization, and rigid registration. Based on the alternating direction method of multipliers, we develop an iterative solver for this nonconvex form of SDP, where the dominant cost per iteration is the partial eigendecomposition of a symmetric matrix. We prove that if the iterates converge, then they do so to a KKT point of the SDP. In the context of rigid registration, we perform several numerical experiments to study the convergence behavior of the solver and its registration accuracy. As an application, we use the solver for wireless sensor network localization from range measurements. The resulting algorithm is shown to be competitive with existing optimization methods for sensor localization in terms of speed and accuracy.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"154 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73724941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modified Generalised Quadrature Spatial Modulation","authors":"Kiran Gunde, K. Hari","doi":"10.1109/NCC.2019.8732234","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732234","url":null,"abstract":"In this paper, we propose a Modified Generalised Quadrature Spatial Modulation (mGQSM) scheme with multiple RF chains. Compared to GQSM, the proposed scheme uses a novel codebook design that provides one extra bit per channel use (bpcu) of spectral efficiency under the constraint ${\\log_{2}\\binom{N_{t}}{N_{rf}}} \\geq 0.5$, where $N_{t}$ denotes the number of transmit antennas and $N_{rf}$ denotes the number of RF chains, with $1 \\leq N_{rf} \\leq \\lfloor \\frac{N_{t}}{2} \\rfloor$. Using the ML detection algorithm, we study the performance of mGQSM with perfect and imperfect channel state information via numerical simulations. We compute the computational complexity of ML decoding in terms of real-valued multiplications and introduce a variant of mGQSM called Reduced Codebook mGQSM (RC-mGQSM), which reduces the complexity at the cost of a decrease in spectral efficiency.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"55 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73777538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpolated Compressed Sensing for Calibrationless Parallel MRI Reconstruction","authors":"S. Datta, B. Deka","doi":"10.1109/NCC.2019.8732192","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732192","url":null,"abstract":"Parallel magnetic resonance imaging (pMRI) data in clinical studies are commonly acquired in multiple slices, in parallel along different channels. Since MRI traditionally suffers from slow data acquisition, reconstruction of images in clinical pMRI would be slower still. Compressed sensing MRI (CS-MRI) has successfully demonstrated its potential in reducing the scan time of pMRI manyfold. Due to the high correlation of adjacent slices in a multi-slice sequence, interpolation of multi-slice data may be carried out to support non-uniform-undersampling-based CS reconstruction of slices in k-space. By exploiting intra/inter-slice as well as multichannel data redundancy of multi-slice pMRI, it is possible to reduce the scan time further. These correlations can be well modeled by introducing multidimensional wavelet forest sparsity and joint total variation regularization during the CS reconstruction. To validate our claim, a number of experiments are carried out with real pMRI datasets and the results are compared with the state of the art.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"55 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83757712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Truthful Double Auction Based VM Allocation for Revenue-Energy Trade-Off in Cloud Data Centers","authors":"Yashwant Singh Patel, Animesh Nighojkar, R. Misra","doi":"10.1109/NCC.2019.8732201","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732201","url":null,"abstract":"With the advances in virtualization technologies, the cloud has emerged as a flexible and cost-effective service paradigm by provisioning on-demand VM resources to users via a pay-per-use business model. In cloud data centers, effective resource provisioning is required with the aim of minimizing energy consumption and maximizing the cloud provider's revenue. However, the existing mechanisms have focused on optimizing either the energy or the profit of the cloud service provider (CSP), while incurring inefficient resource allocation. Thus, to address these fundamental research challenges and to balance the trade-off between energy and revenue, we propose a Vickrey-Clarke-Groves (VCG) based truthful double auction mechanism (TDAM). In this paper, we first formulate a joint optimization problem and prove it NP-hard by reducing it to a multi-dimensional bin-packing problem. Then we design TDAM, a truthful double auction scheme, and propose an efficient winning bid algorithm for VM allocation and a VCG based mechanism for calculating the payment of each bid. Being a double auction, TDAM allows both the buyers (VMs) and the sellers (PMs) to submit their bids and asks, respectively, and performs allocation based on the energy consumption, while upholding truthfulness, in order to avoid falsification of the submitted bid or ask values. Through theoretical analysis and extensive experiments, we show that TDAM makes a significant contribution while maintaining truthfulness, individual rationality, and economic efficiency, with polynomial time complexity.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"78 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83743459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Fusion of Speech and Text using Semi-supervised LDA for Indexing Lecture Videos","authors":"M. Husain, S. Meena","doi":"10.1109/NCC.2019.8732253","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732253","url":null,"abstract":"Lecture videos are the most popular learning materials due to their pedagogical benefits. However, accessing a topic or subtopic of interest requires manual examination of each frame of the video, and this becomes more tedious as the volume and length of videos increase. The main problem thus becomes the efficient automatic segmentation and indexing of lecture videos to enable faster retrieval of specific and relevant content. In this paper, we present automatic indexing of lecture videos using topic hierarchies extracted from slide text and audio transcripts. Indexing videos based on slide text information is more accurate due to higher character recognition rates, but the text content is very abstract and subjective. In contrast to slide text, audio transcripts provide comprehensive details about the topics; however, retrieval results are imprecise due to a higher word error rate (WER). To address this problem, we propose a novel idea of fusing the complementary strengths of slide text and audio transcript information using a semi-supervised LDA algorithm. Further, we strive to improve learning of the model by utilizing words recognized from video slides as seed words and training the model to learn the distribution of video transcriptions around these seed words. We test the performance of the proposed multimodal indexing scheme on 500 classroom videos from Coursera, NPTEL, and KLETU (KLE Technological University). The proposed multimodal fusion based scheme achieves an average improvement of 44.49% in F-Score compared with indexing using unimodal approaches.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"14 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75224462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A SegNet Based Image Enhancement Technique for Air-Tissue Boundary Segmentation in Real-Time Magnetic Resonance Imaging Video","authors":"Renuka Mannem, Valliappan Ca, P. Ghosh","doi":"10.1109/NCC.2019.8732257","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732257","url":null,"abstract":"In this paper, we propose a new technique for segmentation of the Air-Tissue Boundaries (ATBs) in the upper airway of the vocal tract in the midsagittal plane of real-time Magnetic Resonance Imaging (rtMRI) videos. The proposed technique uses a segmentation using Fisher-discriminant measure (SFDM) scheme. The paper introduces an image enhancement technique using semantic segmentation in the preprocessing of the rtMRI frames before ATB prediction. We use a deep convolutional encoder-decoder architecture (SegNet) for semantic segmentation of the rtMRI images. The paper examines the significance of the preprocessing before ATB prediction by implementing the SFDM approach with different preprocessing techniques. Experiments with 5779 rtMRI video frames from four subjects demonstrate that, with the semantic segmentation based image enhancement of rtMRI frames, the performance of the SFDM approach improves compared to the other preprocessing approaches. Experimental results also show that the proposed approach yields 8.6% less error in ATB prediction compared with a semi-supervised grid based baseline segmentation approach.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"18 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72636830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Methods for Estimating Sinusoidal Frequencies Using Line Spectral Pairs","authors":"P. Vishnu, C. S. Ramalingam","doi":"10.1109/NCC.2019.8732199","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732199","url":null,"abstract":"The maximum likelihood (ML) method of estimating the frequencies of $p$ sinusoids in the presence of AWGN is computationally very costly because of the dimensionality of the error surface; the advantage is that the ML method has the lowest threshold among all known practical estimators. We propose a low complexity method using Line Spectral Pairs (LSPs), where the LSPs are derived from an estimated $A(z)$ obtained using the Multiple Signal Classification (MUSIC) method. The proposed method evaluates the likelihood function at significantly fewer points (at most $^{5p}C_{p}$) to obtain the estimates. Furthermore, no iterative finer search is required. Nevertheless, the proposed method's threshold is comparable to that of ML when tested using the well-known two-sinusoids example; similar performance was observed in the case of three sinusoids. Further improvements were observed when the beamformer function was used for detecting and removing outliers. For the two-sinusoid case, outlier removal resulted in a threshold that was lower than that of ML by as much as 9 dB (3π/2 case). We also present results for a direction of arrival (DOA) estimation example that results in the same threshold as that of ML.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"4 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78916135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Beam Wander Effect of Flat-topped Multi-Gaussian Beam for FSO Communication Link","authors":"Arka Mukherjee, Subrat Kar, V. Jain","doi":"10.1109/NCC.2019.8732189","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732189","url":null,"abstract":"Atmospheric turbulence causes severe impairment of FSO communication links. Using the earlier model for the Gaussian beam, we analyze the beam wander effect for the flat-topped multi-Gaussian beam. For a Gaussian beam, the link availability decreases drastically in the high turbulence regime due to turbulence-induced beam wander. In this paper, we model each turbulent eddy as a thin dielectric lens with a Gaussian-shaped refractive index profile and assume there are several sheets of eddies throughout the propagation path. We consider uniformly distributed eddy positions in a laminar sheet with Chi-Square distributed eddy sizes. We graphically demonstrate beam wander characteristics for different beam sizes and orders of the flat-topped multi-Gaussian beam in all three turbulence regimes, characterized by different refractive index structure parameter values. Our results show that the flat-topped beam has a limited advantage in the weak and moderate turbulence regimes, but a significant advantage in the high turbulence regime for mitigating link outage due to beam wander.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"1 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89421817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Camera Zoom Detection and Classification Based on Application of Histogram Intersection and Kullback Leibler Divergence","authors":"Pavan Sandula, M. Okade","doi":"10.1109/NCC.2019.8732240","DOIUrl":"https://doi.org/10.1109/NCC.2019.8732240","url":null,"abstract":"This paper presents a novel compressed domain technique for detecting camera zoom in video sequences and further classifying it as zoom-in or zoom-out. The inter-frame block motion vector field serves as the input to the proposed system and is partitioned into four representative quadrants for analysis. The histograms of these four quadrants are analyzed using a histogram intersection feature for zoom motion detection, while the cumulative histogram of these four quadrants is analyzed using a Kullback-Leibler divergence feature for zoom motion classification. Experimental validation, carried out using block motion vectors extracted with the Exhaustive Search Motion Estimation algorithm as well as H.264 decoded block motion vectors, demonstrates superior performance in comparison to existing techniques.","PeriodicalId":6870,"journal":{"name":"2019 National Conference on Communications (NCC)","volume":"18 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87186557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}