{"title":"Speech Enhancement in Noisy Environments for Video Retrieval","authors":"Huiyu Zhou, A. Sadka, Richard M. Jiang","doi":"10.1109/WIAMIS.2008.38","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.38","url":null,"abstract":"In this paper, we propose a novel spectral subtraction approach for speech enhancement via maximum likelihood estimate (MLE). This scheme attempts to simulate the probability distribution of useful speech signals and hence maximally reduce the noise. To evaluate the quality of speech enhancement, we extract cepstral features from the enhanced signals, and then apply them to a dynamic time warping framework for similarity check between the clean and filtered signals. The performance of the proposed enhancement method is compared to that of other classical techniques. The entire framework does not assume any model for the background noise and does not require any noise training data.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"149 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121191129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparative Study of Classification Techniques for Knowledge-Assisted Image Analysis","authors":"G. Papadopoulos, K. Chandramouli, V. Mezaris, Y. Kompatsiaris, E. Izquierdo, M. Strintzis","doi":"10.1109/WIAMIS.2008.36","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.36","url":null,"abstract":"In this paper, four individual approaches to region classification for knowledge-assisted semantic image analysis are presented and comparatively evaluated. All of the examined approaches realize knowledge-assisted analysis via implicit knowledge acquisition, i.e. are based on machine learning techniques such as support vector machines (SVMs), self organizing maps (SOMs), genetic algorithm (GA) and particle swarm optimization (PSO). Under all examined approaches, each image is initially segmented and suitable low-level descriptors are extracted for every resulting segment. Then, each of the aforementioned classifiers is applied to associate every region with a predefined high-level semantic concept. An appropriate evaluation framework has been employed for the comparative evaluation of the above algorithms under varying experimental conditions.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122406198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Feature Selection for Detection Using Mutual Information Thresholding","authors":"Ciarán Ó Conaire, N. O’Connor","doi":"10.1109/WIAMIS.2008.10","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.10","url":null,"abstract":"This paper proposes a method for unsupervised selection of features for detecting important events in a surveillance context. While traditional feature selection requires manually annotated ground truth to choose the best features, we examine the possibility of exploiting the redundancy between a pair of independent data sources for selecting good detection features. Building on our prior work on mutual information thresholding, we show that strong agreement between data sources indicates strong detection performance. Experimental tests, combining visual and audio data, show that the best performing features can be automatically selected by taking advantage of the common information shared by the sensors.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131325731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual Attention Metadata from Pictures Browsing","authors":"Benoit Baccot, V. Charvillat, R. Grigoras, Cezar Plesca","doi":"10.1109/WIAMIS.2008.57","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.57","url":null,"abstract":"Image browsing has become an indispensable feature for today's mobile devices. To overcome their limited display size, information systems may benefit from the abundant clickstream provided by mobile users. This implicit feedback turns out to be very informative providing hints on both the visual image content and the relevance of the query results when searching for images. Building on previous works on user attention models, we propose a practical, yet generic platform for web usage tracking. Implicit feedback from the clickstream makes it possible to generate visual attention metadata. These metadata are user interest maps (UIMs) in which regions of interest (ROIs) become highlighted. These UIMs are used to establish and reinforce keyword-to-image relations. A tentative application for semantic annotation is also presented.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129583667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Image Retrieval Using Automatic Image Sorting and Semi-automatic Generation of Image Semantics","authors":"K. U. Barthel","doi":"10.1109/WIAMIS.2008.56","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.56","url":null,"abstract":"In this paper we propose a new image search system using keyword annotations and low-level visual meta-data to generate inter-image relationships. Unlike other approaches the new system does not try to learn the degree of confidence between images and associated keywords. We rather propose to model the degree of similarity between images by building up a network of linked images. The weights of the inter-image links are learned from the users' interaction with the system only. For each image search a set of candidate images is selected from a visually sorted arrangement of result images. This candidate set is used to refine the result by filtering out non-suiting images from a larger set of further result images. Semantic inter-image relationships of images can be modeled by collecting the candidate sets from many searches. Our system improves Internet image search significantly.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"217 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121944708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Traffic Sign Recognition Based on Pictogram Contours","authors":"Carlos Filipe Paulo, P. Correia","doi":"10.1109/WIAMIS.2008.31","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.31","url":null,"abstract":"This paper addresses the problem of automatic recognition of traffic signs from images captured while driving, to provide a driver aid. Traffic signs can be detected based on their color and classified, also according to their shape, into danger, information, obligation or prohibition classes - special cases are the stop, yield and wrong way signs. Then, traffic signs can be recognized based on their pictograms, as each pictogram is unique within a given sign class. The proposed recognition algorithm analyses pictogram outer contours. When pictograms are composed of several disconnected regions, a snake is used to create a single pictogram outer contour. Matching of signs against the database is done using the curvature space scale representation. Several examples taken from Portuguese roads are used to demonstrate the effectiveness of the system.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"179 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116456300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DiVA: A Distributed Video Analysis Framework Applied to Video-Surveillance Systems","authors":"Juan Carlos San Miguel, Jesús Bescós, J. Sanchez, Álvaro García-Martín","doi":"10.1109/WIAMIS.2008.29","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.29","url":null,"abstract":"This paper describes a generic, scalable, and distributed framework for real-time video-analysis intended for research, prototyping and services deployment purposes. The architecture considers multiple cameras and is based on a server/client model. The information generated by each analysis module and the context information are made accessible to the whole system by using a database system. System modules can be interconnected in several ways, thus achieving flexibility. Two main design criteria have been low computational cost and easy component integration. The experimental results show the potential use of this system.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130312683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Transformation of MPEG-21 Metadata for Codec-agnostic Adaptation in Real-Time Streaming Scenarios","authors":"Michael Ransburg, Hubert Gressl, H. Hellwagner","doi":"10.1109/WIAMIS.2008.41","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.41","url":null,"abstract":"Scalable media contents, such as the new MPEG-4 scalable video codec enable to easily retrieve different qualities of the media content by simply disregarding certain media segments. The MPEG-21-based codec-agnostic adaptation approach supports this concept by introducing an XML-based bitstream syntax description (BSD) which describes the different segments of a media content. Based on this BSD, an adaptation node can intelligently adapt any scalable media (i.e., remove specific media segments) without the need for codec-specific knowledge. The adaptation approach consists of 1) transforming this BSD and 2) adapting the media based on the transformed BSD. In this paper, we focus on the BSD transformation step and evaluate different mechanisms w.r.t. their transformation efficiency given several application scenarios. In particular, we compare the traditional style sheet-based mechanisms with a novel mechanism based on regular expressions. We discuss both mechanisms in terms of their expressiveness, and propose how to actually employ regular expressions for codec-agnostic adaptation. Finally, we quantitatively evaluate these mechanisms in different adaptation scenarios, which vary in the size and number of required BSD units.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"73 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130392228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of a Reference Platform for Generic Audio Classification","authors":"R. Jarina, M. Paralic, M. Kuba, J. Olajec, Andrej Lukác, Miroslav Dzurek","doi":"10.1109/WIAMIS.2008.39","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.39","url":null,"abstract":"Detection of key sounds, such as applause, laugh, music, environmental noise, etc., is one of the challenges in intelligent management of multimedia information and content understanding. In this paper, we report progress in development of a reference content-based audio classification algorithm that is based on a conventional and widely accepted approach, namely signal parameterization by MFCC followed by GMM classification. Our developed labeled audio database and the conventional classification model should serve as a reference platform for an evaluation of novel, alternative or more advanced methods in audio content analysis.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131583516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Historical Artistic Documentaries","authors":"M. Zeppelzauer, D. Mitrovic, C. Breiteneder","doi":"10.1109/WIAMIS.2008.11","DOIUrl":"https://doi.org/10.1109/WIAMIS.2008.11","url":null,"abstract":"The paper introduces a novel interdisciplinary project addressing the analysis of historical artistic films. The type of employed material has not been subject to automatic analyses, so far. It poses challenges in all areas of content-based analysis and retrieval due to its complex temporal structure and due to substantial degradations. We propose robust features and a method for shot cut detection for this material that outperforms established techniques.","PeriodicalId":325635,"journal":{"name":"2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133984563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}