Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology: Latest Articles
{"title":"Dynamic Connection Strategies (DyConS) for spoken Malay speech recognition","authors":"N. Seman, N. Jamil, Raseeda Hamzah","doi":"10.1109/ISSPIT.2013.6781851","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781851","url":null,"abstract":"This paper presents the fusion of two artificial intelligence (AI) learning algorithms: genetic algorithms (GA) and conjugate gradient (CG) methods. Both methods are used to find the optimum weights for the hidden and output layers of a feedforward artificial neural network (ANN) model. Each algorithm is presented in a separate module, and we propose three different types of Dynamic Connection Strategies (DyConS) for combining both algorithms to improve the performance of spoken Malay speech recognition. Two different GA techniques are used in this research: a mutated GA technique is proposed and compared with the standard GA technique. One hundred experiments with 5000 words are conducted using the proposed DyConS. Consistent with earlier findings, GA combined with ANN proved to attain certain advantages with sufficient recognition performance. The results show that the mutated GA combined with CG performs better than the standard GA and CG models. Integrating the GA with the feed-forward network improved mean square error (MSE) performance, and with a good connection strategy in this two-stage training scheme, the recognition rate increased up to 99%.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"7 1","pages":"000040-000045"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85960498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech emotion detection using time dependent self organizing maps","authors":"H. Balti, Adel Said Elmaghraby","doi":"10.1109/ISSPIT.2013.6781926","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781926","url":null,"abstract":"We propose a framework for speech emotion detection that maps acoustic features into high-level descriptors that integrate temporal context. Our framework uses three different algorithms to integrate the temporal context. The first method is based on temporal averaging of the original features. The second algorithm derives the descriptors by clustering the data using self-organizing maps (SOMs) and computing the temporal average of the activity distribution of the original features on the map. The third algorithm uses multi-resolution window analysis and SOMs to compute a 2-D map of emotions and derives high-level trajectories representing the behavior of the original features on the map. Using a standard emotional database and a K-nearest neighbors classifier, we show that the proposed framework is efficient for analysis, visualization and classification of emotions.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"20 1","pages":"000470-000478"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86035648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on on-line Uyghur character recognition technology based on center distance feature","authors":"Wujiahemaiti Simayi, Mayire Ibrayim, Dilmurat Tursun, A. Hamdulla","doi":"10.1109/ISSPIT.2013.6781896","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781896","url":null,"abstract":"In this paper, the center distance feature (CDF) is presented as an efficient approach for on-line Uyghur handwritten character recognition. Building on earlier research on on-line Uyghur handwritten character recognition, further research is conducted with the center distance feature. This paper introduces the extraction of the center distance feature and three variants, CDF-2, CDF-4 and CDF-8, which improve the average recognition accuracy to 78.17%, 90.47% and 94.50%, respectively, for the 32 isolated forms of Uyghur characters. The experiments use 12800 samples from 400 different writers. The system is trained using 70 percent of the total samples and tested on the remaining 30 percent.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"15 1 1","pages":"000293-000298"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82626230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audio packet loss concealment using motion-compensated spectral extrapolation","authors":"S. Pedram, S. Vaseghi, Bahareh Langari","doi":"10.1109/ISSPIT.2013.6781920","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781920","url":null,"abstract":"This paper presents a packet loss concealment (PLC) method with applications to VoIP, wideband audio broadcast and streaming. The problem of modeling the time-varying frequency spectrum in the context of PLC is addressed, and a novel solution is proposed for tracking and using the temporal motion of spectral flow. The proposed PLC utilizes a time-frequency motion (TFM) matrix representation of the audio signal where each frequency is tagged with a motion vector estimate. The spectral motion vectors are estimated by cross-correlating the movement of spectral energy within sub-bands across time frames. The missing packets are estimated in the TFM domain and inverse transformed to the time domain. In order to compare the proposed method with common approaches, an objective perceptual evaluation of speech quality (PESQ) and subjective listening tests in terms of MOS are conducted over a range of packet loss rates from 5% to 20%. The results demonstrate that the proposed algorithm improves performance compared to a number of benchmark methods including that of the ITU.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"23 1","pages":"000434-000439"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84590603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance and power trade-offs for cryptographic applications in embedded processors","authors":"Chrysovalantis Datsios, G. Keramidas, D. Serpanos, P. Soufrilas","doi":"10.1109/ISSPIT.2013.6781860","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781860","url":null,"abstract":"Cryptographic operations are resource-intensive in terms of computational power and energy consumption. Typical approaches towards secure embedded systems employ dedicated modules, such as ASICs, co-processors, and accelerators, to implement these functions and optimize these hardware modules for the adopted algorithms. In our work, we analyze performance and power trade-offs of typical cryptographic algorithms (DES, AES, and RSA) when executed in processing elements that constitute typical embedded processors. Our goal is to characterize and optimize, performance-wise and power-wise, the sources of inefficiency when the encryption/decryption operations are executed in general purpose embedded processors with different processing and caching capabilities. Our analysis focuses on three major parameters: the parallelism of the core (issue width and size of execution window), voltage and frequency switching in the core, and size of the last-level cache (LLC). Those parameters constitute the major power-consumption contributors in all modern embedded general purpose processors. Our results demonstrate that cryptographic operations can be performed efficiently, in terms of both performance and power consumption, for specific values of the analyzed parameters, indicating that reconfigurable approaches can dynamically optimize processor organization and ameliorate the reported performance and power figures in the context of general purpose embedded processors.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"64 1","pages":"000092-000095"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86159335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel computing techniques for performance enhancement of a cDNA microarray gridding algorithm","authors":"Stamos Katsigiannis, D. Maroulis","doi":"10.1109/ISSPIT.2013.6781922","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781922","url":null,"abstract":"cDNA microarrays are a powerful tool for studying gene expression levels. A challenging and complex task of microarray image analysis is the creation of a grid that matches the spots in the image. Proposed methods and tools usually require human intervention, leading to variations of the gene expression results. Furthermore, while automatic methods are available, they present high computational complexity. In this work, the authors present a performance enhancement via GPU computing techniques of an automatic gridding method, previously proposed by their research group. Complex steps of the algorithm were computed in parallel by utilizing the NVIDIA CUDA architecture that allows the use of NVIDIA GPUs for general purpose parallel computations. Experiments showed that the proposed approach achieves higher utilization of the available computational resources, leading to enhanced performance and significantly reduced computational time.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"66 1","pages":"000446-000451"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73489961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perceptual dissimilarity metric: A full reference objective image quality measure to quantify the degradation of perceptual image quality","authors":"Sajib Saha, M. Tahtali, A. Lambert, M. Pickering","doi":"10.1109/ISSPIT.2013.6781902","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781902","url":null,"abstract":"This paper introduces a full reference objective image quality measure to quantify the degradation of perceptual image quality. Objective methods for assessing perceptual image quality are important for many image processing applications, such as monitoring and controlling image quality for quality control systems, benchmarking image processing systems and so on. The novel image quality metric proposed in this paper uses a relatively small number of pair-wise intensity comparisons to represent a patch as a binary string, then compares corresponding patches using Hamming distances. It then calculates a dissimilarity value between images as an average of the Hamming distances computed between patches. The proposed metric is more consistent with the human visual system and thus outperforms other existing and widely used metrics, namely the root mean square error (RMSE) and structural similarity index (SSIM). The computational cost of the proposed metric is also lower than that of the state-of-the-art method.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"172 1","pages":"000327-000332"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77347635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Method as a preprocessing stage for tracking sperms progressive motility","authors":"Seyed Sajad Mohseni Salehi Monfared, Elnaz Lashgari, Amir Akbarian Aghdam, B. Khalaj","doi":"10.1109/ISSPIT.2013.6781874","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781874","url":null,"abstract":"Methods of human semen assessment are quite wide ranging. In this paper, we use background subtraction methods in order to detect progressive sperms, whose quality of movement strongly influences fertility. Robust Principal Component Analysis (RPCA) is a powerful algorithm which has been used recently for background subtraction purposes. The sperm tracking problem can also be defined as a background subtraction problem. In the RPCA algorithm, data is represented by a low rank plus sparse matrix. In our approach, the foreground data is recovered through such matrix decomposition. We compare the RPCA approach with four other background subtraction methods in order to assess the accuracy of the algorithm as a preprocessing stage in sperm tracking. Two basic background subtraction methods, approximate median and frame difference, have been examined. Furthermore, two more recent methods, the mixture of Gaussians model and robust probabilistic matrix factorization, have been used for comparison. As the results show, the RPCA approach is more robust and less sensitive to outliers in comparison with other background subtraction methods.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"1 1","pages":"000170-000174"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78074345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speaker localization in conferencing systems employing phase features and wavelet transform","authors":"Rafal Samborski, M. Ziólko","doi":"10.1109/ISSPIT.2013.6781903","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781903","url":null,"abstract":"Some existing conference systems employ a distant microphone array instead of microphones dedicated to each user. This approach is much more convenient, although it suffers from much higher noise sensitivity. One possible solution is to employ beamforming techniques to focus on the user that is speaking at the moment. However, a beamformer needs information about the direction of arrival (DOA), which is usually provided by analysing the phase differences between signals. The effectiveness of such a solution decreases dramatically when the environment becomes noisy. In this paper, a novel, robust meeting diarization system is described. The decision about which user is speaking at the moment is based not only on spatial features of the signal (i.e., the speaker's localization) but also on spectral features. The microphone array estimates speaker localization employing generalized cross-correlation with phase transform (GCC-PHAT). Additionally, the speaker recognition system, which employs the wavelet-Fourier transform (WFT), extracts spectral features of the voice. The described solution is much more robust than one based on speaker recognition or speaker localization only. Experiments during meetings in a regular meeting room show that it is less noise sensitive and that switching between speakers is several times faster.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"28 1","pages":"000333-000337"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82072956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech analysis/synthesis by Gaussian mixture approximation of the speech spectrum for voice conversion","authors":"Jamal Amini, Abdoreza Sabzi Shahrebabaki, Navid Shokouhi, H. Sheikhzadeh, K. Raahemifar, M. Eslami","doi":"10.1109/ISSPIT.2013.6781919","DOIUrl":"https://doi.org/10.1109/ISSPIT.2013.6781919","url":null,"abstract":"Voice conversion typically employs spectral features to convert a source voice to a target voice. In this paper, we propose a simple method of fitting the STRAIGHT spectrum with Gaussian mixture (GM) models for speech analysis/synthesis and spectral modification. The mean values of the Gaussians are pre-determined based on Mel-frequency spacing. The standard deviations are also adaptively adjusted using the constant-Q principle and the spectrum amplitudes. Finally, the weights of the Gaussians are determined by sampling the log-spectrum at Mel-frequencies. The proposed analysis/synthesis method (MFLS-GM) is employed for speech analysis/synthesis and voice conversion. Subjective evaluations employing MOS and ABX demonstrate superior performance of the voice conversion using the MFLS-GM compared to systems employing MFCC features. The computation cost of the proposed analysis/synthesis method is also much lower than those based on MFCC.","PeriodicalId":88960,"journal":{"name":"Proceedings of the ... IEEE International Symposium on Signal Processing and Information Technology. IEEE International Symposium on Signal Processing and Information Technology","volume":"12 1","pages":"000428-000433"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82976507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}