{"title":"Compression for the feature points with binary descriptors","authors":"Jian-Jiun Ding, Szu-Wei Fu, Ching-Wen Hsiao, Pin-Xuan Lee, Yen-Chun Chen","doi":"10.1109/ICDSP.2014.6900746","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900746","url":null,"abstract":"Feature points, such as SIFT, BRISK, ORB, and FREAK, are effective for template matching, pattern recognition, and object alignment. However, since an image usually has 200-4000 feature points and the size of each descriptor is 512 or 256 bits, an efficient way of encoding the descriptors and locations of feature points is required. In this paper, we propose an algorithm to encode the descriptors, locations, and angles of BRISK, ORB, and FREAK points efficiently. We use both the global and local statistical characteristics and apply different reference points for the cases where the previous bit is 1 or 0. Moreover, the facts that feature points are not uniformly distributed and that two feature points separated by a short distance always have a small angle difference are also exploited for compression. Simulations show that the proposed algorithm can greatly reduce the data size required for encoding feature points.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130834233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topographical segmentation: A new tool to optimally define temporal region-of-interests of significant difference in ERPs","authors":"Li Hu, Jiasi Shen, Zhiguo Zhang","doi":"10.1109/ICDSP.2014.6900772","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900772","url":null,"abstract":"The statistical identification of temporal region-of-interests (ROIs) of the significant difference in event-related potentials (ERPs) was popularly achieved using the cluster-based approach, in which the clustering was achieved based on the temporal adjacency of statistical significance if data from single-electrode were tested, or based on the spatial and temporal adjacency of statistical significance if data from multi-electrodes were tested. However, this cluster-based approach would be problematic if the significant differences were strong and sustained in time, but varied greatly in space. In other words, neural generators, which contributed to the detected significant differences, changed markedly within the explored temporal-cluster. To solve this problem, we implemented a statistical approach based on topographical segmentation analysis, which did not only make use of the temporal adjacency of significance, but also utilized the scalp distribution of statistical difference. 
We applied this technique to assess the significant difference of SEPs between deviant and standard conditions, and we observed that temporal ROIs, which captured distinct spatial distributions of statistical difference, could be correctly identified using the topographical segmentation analysis by means of quasi-stable scalp distributions.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114712352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probability distribution estimation of music signals in time and frequency domains","authors":"Vaibhav Arora, Ravi Kumar","doi":"10.1109/ICDSP.2014.6900696","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900696","url":null,"abstract":"This paper attempts to estimate the probability distribution of music signals. A number of music signals belonging to different genres of music have been analyzed. Four well known speech distributions viz. Gaussian, Generalized Gamma, Laplacian and Cauchy have been tested as hypotheses. The distribution estimation has been carried out in time and Discrete-Cosine-Transform (DCT) domains. It was observed that skewed Laplacian distribution describes the music samples most accurately with the peakedness of the distribution being correlated with the genre of music. Although Cauchy distribution along with Laplacian has been a good fit for most of the data, it is analytically shown in this work that Laplacian distribution is a better choice for modeling music signals.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124692607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CDMA-FMT: A novel multiple access scheme for 5G wireless communications","authors":"Zongjie Wang, Shuju Fan, Yun Rui","doi":"10.1109/ICDSP.2014.6900798","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900798","url":null,"abstract":"This paper presents an efficient multi-carrier system architecture, with transmitter and receiver implementations, based on the Filterbank MultiTone (FMT) system, which is a traditional filter bank multi-branch multicarrier concept. The FMT approach exhibits some attractive features that are important for highly fragmented spectrum scenarios. In addition, it could serve as a unified framework to incorporate different baseband signal processing technologies. In particular, Code Division Multiple Access-FMT (CDMA-FMT) is proposed and simulated to present a potential multiple access scheme as a candidate 5G waveform.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124728223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A face hallucination method via HR-LLE coefficients constraint","authors":"Zhenli Wei, Xiaoguang Li, L. Zhuo","doi":"10.1109/ICDSP.2014.6900816","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900816","url":null,"abstract":"In most of the existing LLE (Local Linear Embedding) based face hallucination methods, an LR (Low Resolution) face image is usually represented as a linear combination of training samples. The combination coefficients of the LR image are then directly used to estimate the HR (High Resolution) image. However, due to the one-to-many mapping from LR to HR face space, the LR-LLE coefficients are not the same as the corresponding HR-LLE coefficients. Therefore, the estimated HR faces differ from the ground truth. A novel face super-resolution (SR, also named face hallucination) method is proposed in this paper, in which an HR-LLE coefficients constraint is introduced to predict the coefficients of the HR image. It can effectively reduce the error of the estimated HR-LLE coefficients. Then, we develop a novel method to perform face hallucination based on both the global and local features. Experimental results show that the proposed method provides improved performance over the compared methods in terms of both subjective and objective quality.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128783516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time voice activity detection for ECoG-based speech brain machine interfaces","authors":"V. G. Kanas, I. Mporas, H. Benz, K. Sgarbas, Anastasios Bezerianos, N. Crone","doi":"10.1109/ICDSP.2014.6900790","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900790","url":null,"abstract":"In this article, we investigated the performance of a real-time voice activity detection module exploiting different time-frequency methods for extracting signal features in a subject with implanted electrocorticographic (ECoG) electrodes. We used ECoG signals recorded while the subject performed a syllable repetition task. The voice activity detection module used, as input, ECoG data streams, on which it performed feature extraction and classification. With this approach we were able to detect voice activity (speech onset and offset) from ECoG signals with high accuracy. The results demonstrate that different time-frequency representations carried complementary information about voice activity, with the S-transform achieving 92% accuracy using the 86 best features and support vector machines as the classifier. The proposed real-time voice activity detector may be used as a part of an automated natural speech BMI system for rehabilitating individuals with communication deficits.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125578302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An alternating ℓp — ℓ2 projections algorithm (ALPA) for speech modeling using sparsity constraints","authors":"A. Adiga, C. Seelamantula","doi":"10.1109/ICDSP.2014.6900673","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900673","url":null,"abstract":"We address the problem of separating a speech signal into its excitation and vocal-tract filter components, which falls within the framework of blind deconvolution. Typically, the excitation in case of voiced speech is assumed to be sparse and the vocal-tract filter stable. We develop an alternating ℓp - ℓ2 projections algorithm (ALPA) to perform deconvolution taking into account these constraints. The algorithm is iterative, and alternates between two solution spaces. The initialization is based on the standard linear prediction decomposition of a speech signal into an autoregressive filter and prediction residue. In every iteration, a sparse excitation is estimated by optimizing an ℓp-norm-based cost and the vocal-tract filter is derived as a solution to a standard least-squares minimization problem. We validate the algorithm on voiced segments of natural speech signals and show applications to epoch estimation. 
We also present comparisons with state-of-the-art techniques and show that ALPA gives a sparser impulse-like excitation, where the impulses directly denote the epochs or instants of significant excitation.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123289215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Projection based algorithm for detecting exudates in color fundus images","authors":"C. Eswaran, M. D. Saleh, J. Abdullah","doi":"10.1109/ICDSP.2014.6900707","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900707","url":null,"abstract":"The detection and analysis of spot lesions associated with retinal diseases, such as exudates, microaneurysms, and hemorrhages, play an important role in the screening of retinal diseases. This paper presents an algorithm for automated segmentation of exudates from color fundus images. The proposed algorithm comprises two major stages, namely, pre-processing and segmentation. A novel pre-processing method is employed for background removal through contrast enhancement and noise removal. In the second stage, the pre-processed image is sliced horizontally and vertically into a number of slices and then the corresponding projection values are obtained in order to select an appropriate threshold value for each of the image slices. Finally, the optic disc is removed to facilitate the correct identification of exudates and to decrease the number of false positives. The DIARETDB1 database is used to measure the accuracy of the proposed method. Based on experiments conducted on a pixel basis, it is found that the proposed algorithm achieves better results compared to known algorithms. 
With the proposed algorithm, average values of 71.2%, 72.77%, 99.98%, 97.72%, 99.74%, and 83.28% are obtained in terms of overlap, sensitivity, specificity, PPV, accuracy, and kappa coefficient respectively.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122305655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audio surveillance under noisy conditions using time-frequency image feature","authors":"R. Sharan, T. Moir","doi":"10.1109/ICDSP.2014.6900815","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900815","url":null,"abstract":"In this paper, we present a novel method that uses features extracted from the time-frequency image representation of a sound signal in an audio surveillance application. In particular, we investigate two image representations: linear grayscale and log grayscale. We first divide a sound signal into smaller frames and apply a windowing function. The absolute value of the Discrete Fourier Transform of each frame is then computed and normalized to get the intensity values for the linear grayscale image. The generation of the log grayscale image takes a similar approach, but we take the log power of the values before data normalization. Each image is then divided into blocks and central moments are computed in each block. We carry out experiments under different noise conditions and varying signal-to-noise ratios using support vector machines for classification. Based on the classification accuracy, the linear grayscale image approach is found to be more noise robust than the log grayscale image approach. 
It was also found to perform better than mel-frequency cepstral coefficients, which are a common baseline feature in most sound recognition applications.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126059098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hausdorff symmetry operator towards retinal blood vessel segmentation","authors":"Rashmi Panda, N. Puhan, G. Panda","doi":"10.1109/ICDSP.2014.6900737","DOIUrl":"https://doi.org/10.1109/ICDSP.2014.6900737","url":null,"abstract":"Automated retinal blood vessel segmentation is a fundamental component in computer-aided retinal disease screening and diagnosis systems. This paper presents a novel Hausdorff symmetry operator for automatic centerline pixel selection towards retinal blood vessel segmentation. Centerline pixels are determined by considering geometrical symmetry (distance and orientation) and Hausdorff distance based point set matching at the centerline pixel. This is performed in subpixel resolution to achieve higher accuracy. Then K-means clustering is applied to remove false centerline pixels. The selected centerline pixels act as seed points to be used in region growing to segment the retinal blood vessels. Our proposed method is evaluated on the DRIVE and STARE databases. The experimental results demonstrate that the performance of the proposed method is comparable with state-of-the-art techniques. The advantages of the proposed method include its ability to correctly segment thin blood vessels and vessels containing light reflex, and the fact that the disc area is not misclassified as vessels.","PeriodicalId":301856,"journal":{"name":"2014 19th International Conference on Digital Signal Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121037014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}