{"title":"Detection of double compression with the same bit rate in MPEG-2 videos","authors":"Zhensheng Huang, Fangjun Huang, Jiwu Huang","doi":"10.1109/ChinaSIP.2014.6889253","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889253","url":null,"abstract":"The detection of double compression with the same bit rate in MPEG-2 is of great significance in the field of video forensics. However, when the singly compressed and the doubly compressed videos have the same bit rate, no detecting method has been reported yet. In this paper, we propose a new method that can detect double MPEG-2 compression with the same bit rate. Our method is based on the observation that the number of different coefficients between I frames of the singly and doubly compressed MPEG-2 videos is much larger than the number of different coefficients between the I frames of the corresponding doubly and triply compressed MPEG-2 videos. Via a new re-compressing strategy, the aforementioned characteristics of MPEG-2 video can be utilized to discriminate the singly compressed and doubly compressed MPEG-2 videos. Various experiments on different bit rates demonstrate the efficiency of our new method.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"56 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132511117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A heart beat rate detection framework using multiple nanofiber sensor signals","authors":"Liang Zou, Xun Chen, A. Servati, P. Servati, M. McKeown","doi":"10.1109/CHINASIP.2014.6889240","DOIUrl":"https://doi.org/10.1109/CHINASIP.2014.6889240","url":null,"abstract":"Although electrocardiogram (ECG) is one standard way for monitoring heart beat rate, there are of great interests in exploring other types of biophysical signals. A novel type of nanofiber (NF) sensor signals, as a potential alternative choice to ECG signals for heart beat monitoring, are investigated in this paper. To get the heart beat signal, three nano sensors are deployed at the wrist. However, detecting the heart beat rate (HBR) directly from the raw data is challenging because the signals of interest are masked by different types of noise. To address this concern, a two-step framework based on ensemble empirical mode decomposition (EEMD) and multiset canonical correlation analysis (MCCA) is proposed to extract the interesting signals. Further, a specific HBR detection method is presented based on peak detection and peak filtering. We apply the proposed framework to the real data collected from one subject performing 8 tasks, and the results demonstrate its effectiveness and potential in real applications.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126638124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel signal reconstructing method for radar targets","authors":"Jia Duan, Lei Zhang, Yifeng Wu, M. Xing, Min Wu","doi":"10.1109/ChinaSIP.2014.6889226","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889226","url":null,"abstract":"In this paper, a novel signal reconstructing method for radar targets is proposed based on the attributed scattering center model. By extracting the attributed parameters, the large amount of target data can be represented by small amounts of attributed parameters. In this way, the data amount has been compressed sharply, which releases the computer memory for storage. After extraction, a target discriminating method is presented by applying a CFAR threshold to the energy of extracted attributed scattering centers, by which, weak distributed scattering centers with relatively high energy in total can be discriminated from noise under low SNRs. Experimental results validate the effectiveness of the signal reconstructing capability of the proposal.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"68 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116131082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic cell phone recognition from speech recordings","authors":"Ling Zou, Ji-Chen Yang, Tangsen Huang","doi":"10.1109/ChinaSIP.2014.6889318","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889318","url":null,"abstract":"Recording device recognition is an important research field of digital audio forensic. In this paper, we utilize Gaussian mixture model-universal background model (GMM-UBM) as the classifier to form a recording device recognition system. We examine the performance of Mel-frequency cepstral coefficients (MFCCs) and Power-normalized cepstral coefficients (PNCCs) to this problem. Experiments conducted on recordings come from 14 cell phones show that MFCCs are more effective than PNCCs in cell phone recognition. We find that the identification performance can be improved by stacking MFCCs and energy feature. We also investigate the effect of speaker mismatch and de-noising processing for acoustic feature to this problem. The highest identification accuracy achieved here is 97.71%.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124220818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved parsing with taxonomy of conjunctions","authors":"Dongchen Li, Xiantao Zhang, Xihong Wu","doi":"10.1109/ChinaSIP.2014.6889199","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889199","url":null,"abstract":"Incorporating knowledge for training a parser has been shown to remedy the weaknesses of probabilistic context-free grammar. Previous parsing systems have exploited content words semantic resource and word-formation knowledge. However, they are limited in that they do not take into account conjunction category refinement, which stands out to be helpful in predicting the syntactic structure and syntactic label in Chinese. We define a conjunction taxonomy representing intrinsic syntactic constraints, and show that refined categories in the taxonomy for conjunctions contribute to improved parsing performance. The taxonomy is used to supervise the splitting of these refined tags, and the automatic hierarchical state-split approach is employed to compensate for the limitation in the scope and refinement degree of the taxonomy. The experiments are carried out on Penn Chinese Treebank, which show that our method can improve parsing performance significantly.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121482454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perceptually lossless video re-encoding for cloud transcoding","authors":"Chieh-Kai Kao, Tsung-Yau Huang, Homer H. Chen, Ja-Ling Wu","doi":"10.1109/ChinaSIP.2014.6889252","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889252","url":null,"abstract":"In this paper, we present a perceptually lossless video re-encoding approach to cloud transcoding based on a just noticeable distortion (JND) model. The bitrate is minimized by adaptive step size and dynamic rounding offset adjustment. On the average, the proposed approach achieves 9.7% overall bitrate reduction compared to the H.264/AVC JM reference encoder under the same coding condition. The performance of the resulting cloud transcoding is further verified by a subjective test.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122570493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Popularity index through video semantic quality assessment","authors":"M. Shahid, S. Khatibi, Yared Tuemay","doi":"10.1109/ChinaSIP.2014.6889261","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889261","url":null,"abstract":"Popularity of the streaming media content such as videos can be ascribed to the perceptual quality, to some extent, of the content. The traditional methods of audio/video quality assessment lack in provision of the input from higher cognitive of the human perception. Some studies have revealed that liking or disliking of a certain content can bias the human judgement towards video quality. In this paper, we have examined the impact of the use of semantic quality indicators namely audio content, audio quality, video content, and video quality in the assessment of quality of a video. Further, we have proposed a methodology to use these indicators for designing a prediction model for the popularity of streaming videos.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128297119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast motion estimation for HEVC with adaptive search range decision on CPU and GPU","authors":"Sangmin Kim, Dongkyu Lee, Chae-Bong Sohn, Seoung-Jun Oh","doi":"10.1109/ChinaSIP.2014.6889262","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889262","url":null,"abstract":"In this paper, we propose a fast Motion Estimation (ME) algorithm with Adaptive Search Range (ASR) decision to further accelerate the Graphics Processing Units (GPU)-based ME for High Efficiency Video Coding (HEVC). The proposed approach adaptively decides search ranges on the Central Processing Unit (CPU) and transfers them to the GPU. Then, the GPU performs ME process in parallel. The proposed approach solves the dependency problem in the Motion Vector Predictor (MVP) derivation stage by using only temporal Motion Vectors (MVs). The proposed algorithm yields the total encoding time reduction of 40.3% with negligible Rate Distortion (RD) loss of 1.2%. In terms of ME, the GPU-based ME with ASR decision provides the time reduction of 54.7% and 1446.8× speed-up on average compared to the GPU-based ME without ASR decision and the full-search ME in the reference model, respectively.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126944578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multichannel widely linear Wiener filter for binaural noise reduction in the short-time-Fourier-transform domain","authors":"Liheng Zhao, Jingdong Chen, J. Benesty","doi":"10.1109/ChinaSIP.2014.6889237","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889237","url":null,"abstract":"Binaural noise reduction is a very challenging problem since it requires not only to reduce noise, but also to recover the spatial information of the desired speech source so that the listener can localize this source from the binaural outputs. In this paper, we study the problem in the short-time-Fourier-transform (STFT) domain with the use of an array of microphones. Combining the multichannel microphone observations into a number of complex signals and merging the two (binaural) expected output channels into a complex signal, we reformulate the problem with the widely linear (WL) estimation technique. To efficiently achieve the optimal estimation, the complex signals are transformed into the frequency domain via the STFT. We then derive a WL Wiener filter based on the WL estimation theory and the mean-squared-error (MSE) criterion. This WL Wiener filter is shown to be able to exploit the noncircularity of the complex speech signals and the spatial information captured by the microphone array to achieve noise reduction while preserving the sound spatial information.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128195186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient implementation of acoustic crosstalk cancellation for 3D audio rendering","authors":"S. Cecchi, A. Primavera, M. Virgulti, F. Bettarelli, Junfeng Li, F. Piazza","doi":"10.1109/ChinaSIP.2014.6889234","DOIUrl":"https://doi.org/10.1109/ChinaSIP.2014.6889234","url":null,"abstract":"The paper deals with the development of an efficient real time system for the reproduction of a spatialized audio field taking into account the listeners position. The system is composed of two parts: a sound rendering system based on a crosstalk canceller that is required in order to have a spatialized audio reproduction and a listener position tracking system in order to model the crosstalk canceller parameters. Then, an efficient implementation of a time domain crosstalk cancellation algorithm is presented considering an improved version of the recursive ambiophonics crosstalk elimination algorithm. A real time application is proposed introducing a Kinect control, capable to accurately track the listener position and changing the crosstalk parameters related to its position. Several results are presented comparing the proposed approach with the state of the art in order to confirm its validity.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128516215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}