{"title":"Comparative Study on Dimensionality Reduction in Large-Scale Image Retrieval","authors":"Bo Cheng, L. Zhuo, Jing Zhang","doi":"10.1109/ISM.2013.86","DOIUrl":"https://doi.org/10.1109/ISM.2013.86","url":null,"abstract":"Dimensionality reduction plays a significant role for the performance of large-scale image retrieval. In this paper, various dimensionality reduction methods are compared to validate their own performance in image retrieval. For this purpose, first, the Scale Invariant Feature Transform (SIFT) features and HSV (Hue, Saturation, Value) histogram are extracted as image features. Second, the Principal Component Analysis (PCA), Fisher Linear Discriminant Analysis (FLDA), Local Fisher Discriminant Analysis (LFDA), Isometric Mapping (ISOMAP), Locally Linear Embedding (LLE), and Locality Preserving Projections (LPP) are respectively applied to reduce the dimensions of SIFT feature descriptors and color information, which can be used to generate vocabulary trees. Finally, through setting the match weights of vocabulary trees, large-scale image retrieval scheme is implemented. By comparing multiple sets of experimental data from several platforms, it can be concluded that dimensionality reduction method of LLE and LPP can effectively reduce the computational cost of image features, and maintain the high retrieval performance as well.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"105 1","pages":"445-450"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73643137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Method for Identifying Exact Sensor Using Multiplicative Noise Component","authors":"B. Mahdian, S. Saic","doi":"10.1109/ISM.2013.46","DOIUrl":"https://doi.org/10.1109/ISM.2013.46","url":null,"abstract":"In this paper, we analyze and analytically describe the multiplicative deterministic noise component of imaging sensors in a novel way. Specifically, we show how to use the multiplicative nature of this component to derive a method enabling its estimation. Since this noise component is unique per sensor, consequently, the derived method is applied on digital image ballistics tasks in order to pinpoint the exact device that created a specific digital photo. Moreover, we enhance the method to be resistent to optical zoom and JPEG compression.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"59 1","pages":"241-247"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74374790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network Coding for Streaming Video over P2P Networks","authors":"F. A. López-Fuentes, C. Cabrera-Medina","doi":"10.1109/ISM.2013.63","DOIUrl":"https://doi.org/10.1109/ISM.2013.63","url":null,"abstract":"In this contribution we simulate network coding and evaluate its benefits for streaming video over P2P networks. Network coding has emerged as a promise technique in the information theory field. This novel technique has shown several benefits in the communication networks related to throughput, security and resources optimization. In this work, we implement network coding for a multi-source P2P scenario. The video is encoded in the sources, while the intermediate nodes implement network coding before forwarding the encoded packets to the end nodes. The received packets are decoded in each receiving peer in order to recovery the original video. Our scheme is implemented under the H.264/MPEG-4 AVC compression standard and using the network simulator (NS-2). We evaluate our scheme in terms of overall throughput, packet loss and video quality. Results show that these parameters can be improved in P2P video streaming systems by using network coding.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"8 1","pages":"329-332"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75208079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secure Steganography Technique Based on Bitplane Indexes","authors":"A. A. Abdulla, S. Jassim, H. Sellahewa","doi":"10.1109/ISM.2013.55","DOIUrl":"https://doi.org/10.1109/ISM.2013.55","url":null,"abstract":"This paper is concerned with secret hiding in multiple image bitplanes for increased security without undermining capacity. A secure steganographic algorithm based on bitplanes index manipulation is proposed. The index manipulation is confined to the first two Least Significant Bits of the cover image. The proposed algorithm has the property of un-detect ability with respect to stego quality and payload capacity. Experimental results demonstrate that the proposed technique is secure against statistical attacks such as pair of value (PoV), Weighted Stego steganalyser (WS), and Multi Bitplane Weighted Stego steganalyser (MLSB-WS).","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"69 1","pages":"287-291"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80287198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impression Estimation of Video and Application to Video Creation","authors":"Kiyoshi Tokunaga, Takahiro Hayashi","doi":"10.1109/ISM.2013.25","DOIUrl":"https://doi.org/10.1109/ISM.2013.25","url":null,"abstract":"Adding BGM (background music) to a video is an important process in video creation because BGM determines the impression of the video. We model impression estimation of a video as mappping from computer-mesurable audio and visual features to impression degrees. As an application of impression estimation of a video, we propose OtoPittan, a system for recommending BGM for helping users to make impressive videos. OtoPittan regards the problem of selecting BGM from a music collection as a partial inverse problem of the impression estimation. That is, to an inputted video and desired impression, BGM which produces a good match to the desired impression when adding it to the inputted video is recommended. As implementation ways of impression estimation of a video, we use a static user model and a dynamic user model. The first model statically constructs a mapping function learnt from training data. The second model dynamically optimizes a mapping function through user interaction. Experimental results have shown that the static user model has high estimation accuracy and the dynamic user model can efficiently performs optimization without much user interaction.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"50 1","pages":"102-105"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86001954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cellular GPU Model for Structured Mesh Generation and Its Application to the Stereo-Matching Disparity Map","authors":"N. Zhang, Hongjian Wang, Jean-Charles Créput, Julien Moreau, Y. Ruichek","doi":"10.1109/ISM.2013.18","DOIUrl":"https://doi.org/10.1109/ISM.2013.18","url":null,"abstract":"This paper presents a cellular GPU model for structured mesh generation according to an input stereo-matching disparity map. Here, the disparity map stands for a density distribution that reflects the proximity of objects to the camera in 3D space. The meshing process consists in covering such data density distribution with a topological structured hexagonal grid that adapts itself and deforms according to the density values. The goal is to generate a compressed mesh where the nearest objects are provided with more details than objects which are far from the camera. The solution we propose is based on the Kohonen's Self-Organizing Map learning algorithm for the benefit of its ability to generate a topological map according to a probability distribution and its ability to be a natural massive parallel algorithm. We propose a GPU parallel model and its implantation of the SOM standard algorithm, and present experiments on a set of standard stereo-matching disparity map benchmarks.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"37 13 1","pages":"53-60"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78332862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Portable Multi-camera High Definition Video Capture Using Smartphones","authors":"Surendar Chandra, Patrick Chiu, Maribeth Back","doi":"10.1109/ISM.2013.74","DOIUrl":"https://doi.org/10.1109/ISM.2013.74","url":null,"abstract":"Real-time tele-immersion requires low latency and synchronized multi-camera capture. Prior high definition (HD) capture systems were bulky. We investigate the suitability of using flocks of smartphone cameras for tele-immersion. Smartphones integrate capture and streaming into a single portable package. However, they archive the captured video into a movie. Hence, we create a sequence of H.264 movies and stream them. Capture delay is reduced by minimizing the number of frames in each movie segment. However, fewer frames reduces compression efficiency. Also, smartphone video encoders do not sacrifice video quality to lower the compression latency or the stream size. On an iPhone 4S, our application that uses published APIs streams 1920×1080 videos at 16.5 fps with a delay of 712 ms between a real-life event and displaying an uncompressed bitmap of this event on a local laptop. Note that the bulky Cisco Tandberg required 300 ms delay. Stereoscopic video from two unsynchronized smartphones also showed minimal visual artifacts in an indoor setting.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"79 1","pages":"391-398"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82561564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interactive Event Recognition in Video","authors":"Mennan Güder, N. Cicekli","doi":"10.1109/ISM.2013.24","DOIUrl":"https://doi.org/10.1109/ISM.2013.24","url":null,"abstract":"In this paper, we propose a multi-modal decision-level fusion framework to recognize events in videos. The main parts of the proposed framework are ontology based event definition, structural video decomposition, temporal rule discovery and event classification. Various decision sources such as audio continuity, content similarity, and shot sequence characteristics together with visual video feature sets are combined with event descriptors during decision-level fusion. The method is considered to be interactive because of the user directed ontology connection and temporal rule extraction strategies. It enables users to integrate available ontologies such as Image Net and Word Net while defining new event types. Temporal rules are discovered by association rule mining. In the proposed approach, computationally I/O intensive requirements of the association rule mining is reduced by one-pass frequent item set extractor and the proposed rule definition strategy. Accuracy of the proposed methodology is evaluated by employing TRECVid 2007 high level feature detection data set by comparing the results with C4.5 decision tree, SVM classifiers and Multiple Correspondence Analysis.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"57 1","pages":"100-101"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75958495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structural Segmentation of Music Based on Repeated Harmonies","authors":"W. B. Haas, A. Volk, F. Wiering","doi":"10.1109/ISM.2013.48","DOIUrl":"https://doi.org/10.1109/ISM.2013.48","url":null,"abstract":"In this paper we present a simple, yet powerful method for deriving the structural segmentation of a musical piece based on repetitions in chord sequences, called FORM. Repetition in harmony is a fundamental factor in constituting musical form. However, repeated pattern discovery in music still remains an open problem, and it has not been addressed before in chord sequences. FORM relies on a suffix tree based algorithm to find repeated patterns in symbolic chord sequences that are either provided by machine transcriptions or musical experts. This novel approach complements other segmentation methods, which generally use a self-distance matrix based on other musical features describing timbre, instrumentation, rhythm, or melody. We evaluate the segmentation quality of FORM on 649 popular songs, and show that FORM outperforms two baseline approaches. With FORM we explore new ways of exploiting musical repetition for structural segmentation, yielding a flexible and practical algorithm, and a better understanding of musical repetition.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"1 1","pages":"255-258"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88282567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Musical Genre Borders for Multi-label Genre Classification","authors":"Hiroki Nakamura, Hung-Hsuan Huang, K. Kawagoe","doi":"10.1109/ISM.2013.108","DOIUrl":"https://doi.org/10.1109/ISM.2013.108","url":null,"abstract":"In this paper, we propose a novel method to detect music genre borders for the music genre classification. The music genre classification is getting more important because music is influenced by an increasing amount of different musical styles. A general approach to classify music genres is a single genre labeling that usually gives the meaning of inherent stylistic elements to a musical piece. However this gives ambiguity in case of a music piece having multiple genres. To solve the problem, we consider separating the multi-label classification task into the single-label genre classification task. We propose a novel method to detect music genre borders for multi-label genre classification. The proposed method can find borderlines of different genres in music. Moreover, it is strongly expected to realize the multi-label genre classification to apply the single-label genre classification to each detected music segment.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"10 23 1","pages":"532-533"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82878418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}