{"title":"Generic Audio Classification Using a Hybrid Model Based on GMMs and HMMs","authors":"Menaka Rajapakse, L. Wyse","doi":"10.1109/MMMC.2005.44","DOIUrl":"https://doi.org/10.1109/MMMC.2005.44","url":null,"abstract":"A hybrid model comprised of Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs) is used to model generic sounds with large intra class perceptual variations. Each class has a variable number of mixture components in the GMM. The number of mixture components is derived using the Minimum Description Length (MDL) criterion. The overall performance of the hybrid model was compared against models based on HMMs and GMMs with a fixed number of mixture components across all classes. We show that a hybrid model outperforms both class-based GMMs, HMMs, and GMMs based on a fixed number of components. Further, our experiments revealed that the contribution of transitions between states in HMMs has no significant effect on the overall classification performance of generic sounds when large intra class perceptual variations are present among sounds in the training and test datasets. Sounds that show multi-event structure with events that tend to be similar (repetitive) indicated improved performance when modeled with HMMs that can be attributed to HMM’s state transition property. Conversely, GMMs indicate better performance when the sound samples show subtle or no repetitive behavior. These results were validated using the MuscleFish sound database.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129118188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Perceptual Tempo Detection of Music","authors":"Bee Yong Chua, Guojun Lu","doi":"10.1109/MMMC.2005.49","DOIUrl":"https://doi.org/10.1109/MMMC.2005.49","url":null,"abstract":"Perceptual tempo refers to a listener's perception of how fast the music goes when listening to a piece of music with a fairly constant overall tempo. Existing work on automatically determining the tempo of a piece of music is usually not able to determine the perceptual tempo. Our previous work on the Perceptual Tempo Estimator (PTE) improved on existing work, and experimental results showed that it determines the perceptual tempo more effectively than the existing work. However, there are music pieces that have an overall uniform perceived tempo but contain some small parts (segments) with a quite different tempo from other parts. The PTE may not work well for this type of music if the energy level in these small segments is significantly high. In this paper, we propose an improved PTE (IPTE) to provide better tempo determination. Experimental results show that IPTE is more effective in determining the perceptual tempo of a music signal with constant perceived tempo.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129632048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WS-QBE: A QBE-Like Query Language for Complex Multimedia Queries","authors":"I. Schmitt, Nadine Schulz, Thomas Herstel","doi":"10.1109/MMMC.2005.72","DOIUrl":"https://doi.org/10.1109/MMMC.2005.72","url":null,"abstract":"The visual database query language QBE (query by example) is a classical, declarative query language based on the relational domain calculus. However, due to insufficient support of vagueness QBE is not an appropriate query language for formulating similarity queries required in the context of multimedia databases. In this work we propose the query language WS-QBE which combines a schema to weight query terms as well as concepts from fuzzy logic and QBE into one language. WS-QBE enables a visual, declarative formulation of complex similarity queries. The semantics of WS-QBE is defined by a mapping of WS-QBE queries onto the similarity domain calculus SDC which is proposed here, too.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123714449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Screening in Large Face Databases Using Merit-Based Dominant Points","authors":"Yongsheng Gao","doi":"10.1109/MMMC.2005.38","DOIUrl":"https://doi.org/10.1109/MMMC.2005.38","url":null,"abstract":"Current face identification approaches require computer systems to search through large quantity of face feature sets in the database and pick the ones that best match the features of an unknown input face. In this paper, a fast screening method for large face database searching is proposed. The method utilizes dominant points instead of edge maps as features for similarity measurement. A new formulation of Hausdorff distance is designed for merit-based dominant point matching. The screening experiments demonstrated that the proposed face screening method significantly improves the computational speed and the storage economy. It provides a very efficient way for large face databases searching and screening.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121257258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Community-Based Recommendation System to Reveal Unexpected Interests","authors":"J. Kamahara, T. Asakawa, S. Shimojo, H. Miyahara","doi":"10.1109/MMMC.2005.5","DOIUrl":"https://doi.org/10.1109/MMMC.2005.5","url":null,"abstract":"Current collaborative filtering can't represent various aspects of users' interests. We propose a recommendation method in which a user can find new interests that are partially similar to the user's taste. Partial similarity is an aspect of the user's preference which is projected by the community in which the user belongs. We developed a television program recommendation system which performs such recommendation with serendipity, conducted an actual experiment and evaluated its results.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115932764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fuzzy Expert System for Concept-Based Image Indexing and Retrieval","authors":"I. A. Azzam, C. Leung, J. F. Horwood","doi":"10.1109/MMMC.2005.8","DOIUrl":"https://doi.org/10.1109/MMMC.2005.8","url":null,"abstract":"Image indexing and retrieval using a concept-based approach involves extraction, modelling and indexing of image content information. Computer vision offers a variety of techniques for searching images in large collections. We propose a method that enables components of an image to be categorised on the basis of their relative importance in combination with filtered representations. Our method concentrates on matching subparts of images, defined in a variety of ways, in order to find particular objects. These ideas are illustrated with a variety of examples. We focus on Concept-based Image Indexing and Retrieval (CIIR), using a fuzzy expert system, density measure, supporting factors and other attributes of image components to identify and retrieve images accurately and efficiently.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116673301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Browsing Texture Image Databases","authors":"Suryani Lim, Lianping Chen, Guojun Lu, Ray Smith","doi":"10.1109/MMMC.2005.25","DOIUrl":"https://doi.org/10.1109/MMMC.2005.25","url":null,"abstract":"The MPEG-7 standard defines two types of texture features: texture retrieval descriptor (TRD) for retrieval and texture browsing descriptor (TBD) for browsing. The retrieval process is straightforward but it is unclear how one could use TBD for browsing. This paper describes two methods of generating layouts for browsing a texture image database. The layouts are then subject to quantitative and qualitative evaluations. The experiments showed that: (1) only some features of TBD are appropriate for browsing, (2) once the inappropriate features are removed TBD is good for browsing only if the textures are structured, (3) the layouts generated using TRD are more suitable for browsing.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124047750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Wireless Layered Multicast Congestion Control Protocol for Multimedia","authors":"Dongsheng Yin, F. Zhang, Guangzhao Zhang","doi":"10.1109/MMMC.2005.16","DOIUrl":"https://doi.org/10.1109/MMMC.2005.16","url":null,"abstract":"In this paper, a Receiver-Based Wireless Layered Multicast Congestion Control Protocol for Multimedia is proposed, which is called RWLM. The congestion detection mechanism, receiver state machine and congestion control algorithm are described in detail. The experiment results prove that: RWLM not only achieves excellent performance in TCP-Friendliness over low bit error ratio wireless channel, but also keeps stable receiving rate over high bit error ratio wireless channel.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126505503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic-Sensitive Classification for Large Image Libraries","authors":"Jialie Shen, J. Shepherd, A. Ngu","doi":"10.1109/MMMC.2005.66","DOIUrl":"https://doi.org/10.1109/MMMC.2005.66","url":null,"abstract":"With advances in multimedia technology, image data with various formats is becoming available at an explosive rate from various domain applications. How to efficiently organise and access them has been an extremely important issue and is enjoying growing attention. In this paper, we present results from experimental studies investigating performance of image classification for a novel dimension reduction scheme with hybrid architecture. We demonstrate that not only can the method provide superior quality of classification accuracy with various machine learning based classifiers but also substantially speed up training and categorisation process. Moreover, it is fairly robust against various kinds of visual distortions and noises.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126333923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling of Output Constraints in Multimedia Database Systems","authors":"Thomas Heimrich","doi":"10.1109/MMMC.2005.54","DOIUrl":"https://doi.org/10.1109/MMMC.2005.54","url":null,"abstract":"Constraints are used in traditional database systems to define consistent database states. For multimedia data it is also important to define constraints for a correct data output. The producer of multimedia data should specify constraints for a correct data output. We show the modeling of output constraints. The multimedia database system must guarantee that the stored multimedia data and the data output of multimedia data are according to the defined output constraints. For that an efficient check of output constraints must be possible.","PeriodicalId":121228,"journal":{"name":"11th International Multimedia Modelling Conference","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127679111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}