{"title":"Thesaurus-aided approach for image browsing and retrieval","authors":"Jun Yang, Wenyin Liu, HongJiang Zhang, Yueting Zhuang","doi":"10.1109/ICME.2001.1237927","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237927","url":null,"abstract":"The current trend of image retrieval is to incorporate image semantics with visual features to enhance retrieval performance. Although many approaches annotate images with keywords and process queries at the semantic level, they fail to explore the full potential of semantics. This paper proposes thesaurus-aided approaches to facilitate semantics-based access to images. The contributions of our work are two-fold: constructing a dynamic semantic hierarchy (DSH) which supports flexible image browsing by semantic subjects, and formulating a semantic similarity metric that is combined with visual similarity to improve the accuracy of image retrieval. Experiments conducted on real-world images demonstrate the effectiveness of our approaches.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126761007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distortion-based packet marking for mpeg video transmission over diffserv networks","authors":"Juan Carlos De Martin, D. Quaglia","doi":"10.1109/ICME.2001.1237741","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237741","url":null,"abstract":"We present a distortion-based approach to packet classification for multimedia transmission over differentiated-services packet networks. Instead of sending all traffic as premium or relying on a priori data partitioning, packets are individually examined and assigned to different service classes depending on the level of distortion that their loss would introduce at the decoder. Applied to video sequences encoded with the ISO MPEG-2 video coding standard, the proposed distortion-based packet marking scheme outperforms source-transparent techniques and provides substantial and consistent gains in PSNR over the regular best-effort case sending as little as 10% of the packets as premium traffic. Video samples are available at","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124863877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Content based retrieval of 3D cellular structures","authors":"S. Berretti, A. Bimbo, P. Pala","doi":"10.1109/ICME.2001.1237865","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237865","url":null,"abstract":"Recent advances in management of multimedia digital libraries enable effective retrieval of information in the form of audio, image and video. However, retrieval of information in the form of 3D objects has received limited attention so far. Yet many archives of 3D objects already exist and are expected to grow both in relevance and size. In this paper, we address the problem of effective description and retrieval of 3D data representing intracellular structures. These structures are represented in the form of image stacks, where an image stack is a set of 2D images representing planar sections of a cellular body at different heights. In the proposed method, 2D visual feature descriptors and Hidden Markov Models are combined to obtain a representation model which is able to distinguish such intracellular structures as Golgi, nucleus, endoplasmic reticulum and lysosomes. Preliminary results are presented to show the effectiveness of the proposed representation model.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121677888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A DOM-based MVC multi-modal e-business","authors":"Stephane H Maes, Rafah Hosn, Jan Kleindienst, T. Macek, T. Raman, L. Serédi","doi":"10.1109/ICME.2001.1237867","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237867","url":null,"abstract":"Modality: A particular type of physical interface that can be perceived or interacted with by the user (e.g. voice interface, GUI display with keypad, etc.). Multi-modal Browser: A browser that enables the user to interact with an application through different modes of interaction (typically Voice and GUI). Accordingly, a multi-modal browser provides different modalities for input and output. Ideally it lets the user select at any time the modality that is the most appropriate to perform a particular interaction given this interaction and the user's situation. Thesis: By improving the user interface, we believe that multi-modal browsing will significantly accelerate the acceptance and growth of m-Commerce. Multiple access mechanisms One interaction mode per device PC Standardized rich visual interface Not suitable for mobile use I need a direct flight from New York to San Francisco after 7:30pm today There are five direct flights from New York's LaGuardia airport to San Francisco after 7:30pm today: Delta flight nnn...","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121843122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical model-based bit allocation for wavelet coding with applications to multiple description coding and region of interest coding","authors":"P. Sagetong, Antonio Ortega","doi":"10.1109/ICME.2001.1237645","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237645","url":null,"abstract":"We address the problem of allocating bits to the different regions in an image coded with a progressive wavelet coder such as SPIHT (Set Partitioning in Hierarchical Trees) [1]. This type of problem appears in applications such as Region of Interest (ROI) coding or Multiple Description Coding (MDC). The wavelet coefficients in both cases are divided by different factors before coding to enable different bit allocation to different regions, because the coefficients in each region are refined at different speeds. While this is a popular approach for ROI coding [2, 3], we propose using it for MDC as well. In this work, we introduce a priority scaling factor ( ) as a dividing factor. The main contribution of this work is to provide an analytical technique to determine what the should be, given criteria such as relative importance of the regions in ROI coding or degree of redundancy in an MDC. Our approach is based on an approximation to Mallat’s model [4]. We show how our selection of is basically the same as that obtained by optimization of empirical data, with significantly less complexity.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128361643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual information retrieval from annotated large audiovisual assets based on user profiling and collaborative recommendations","authors":"K. Ntalianis, S. Ioannou, K. Karpouzis, G. Moschovitis, S. Kollias","doi":"10.1109/ICME.2001.1237893","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237893","url":null,"abstract":"Current multimedia databases contain a wealth of information in the form of audiovisual and text data. Even though efficient search algorithms have been developed for either media, there still exists the need for abstract data presentation and summarization. Moreover, retrieval systems should be capable of providing the user with additional information related to the specific subject of the query, as well as suggesting other, possibly interesting topics. In this paper, we present a number of solutions to these issues, giving an integrated architecture as an example, along with notions that can be smoothly integrated in MPEG-7 compatible multimedia database systems. Initially, video sequences are segmented into shots and classified into a number of predetermined categories, which are used as a basis for user profiles, enhanced by relevance feedback. Moreover, this clustering scheme assists the notion of \"lateral\" links that enable the user to retrieve data of similar nature or content to those already returned. In addition to this, the system is able to \"predict\" information that is possibly relevant to specific users and present it along with the returned results.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115025415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stationary background generation in mpeg compressed video sequences","authors":"R. S. Aygün, A. Zhang","doi":"10.1109/ICME.2001.1237817","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237817","url":null,"abstract":"The development of the new video coding standard, MPEG-4, has triggered many video segmentation algorithms that address the generation of the video object planes (VOPs). The background of a video scene is one kind of VOP, on which all other video objects are layered. In this paper, we propose a method for the generation of the stationary background in an MPEG compressed video sequence. If the objects move frequently and all the components of the background are visible in the video sequence, the background macroblocks can be constructed by using Discrete Cosine Transform (DCT) DC coefficients of the blocks. After the generation of the stationary background, the moving objects can be extracted by taking the difference between the frames and the background.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129870768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handling large real-time disk access requests with variable priorities","authors":"Mahfuzur Rahman, Khaled M. Elbassioni, I. Kamel","doi":"10.1109/ICME.2001.1237768","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237768","url":null,"abstract":"This paper addresses the problem of providing different levels of performance guarantee for disk I/O. In typical applications, disk requests are classified into different categories based on the required quality of service (QoS), which is usually characterized by a priority and a deadline for each request. Traditional algorithms usually service all high priority requests before low priority requests, and this may result in potential starvation for the low priority requests. In this paper, a disk-scheduling algorithm is introduced to provide such QoS guarantee and avoid starvation. Our target applications for this algorithm are non-linear editing systems for continuous data, where the block size is large enough to ignore the seek time. The proposed algorithm tries to service a request with lower priority and strict deadline only if servicing this request will not violate the deadline constraints of a higher priority request. Simulation experiments are presented to show the superiority of the proposed algorithm over the traditional ones.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129873975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tree-pruning listless zerotree coding","authors":"Wen-Kuo Lin, A. Moini, N. Burgess","doi":"10.1109/ICME.2001.1237745","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237745","url":null,"abstract":"Previously we have proposed a simple zerotree coding algorithm called Listless Zerotree Coding (LZC) that has a significantly lower coding memory requirement than SPIHT. However, LZC performs the SPIHT-like recursive tree search that produces reconstructed images of uneven visual quality at low bit-rates. Therefore, in this paper we propose a new LZC algorithm called Tree-Pruning Listless Zerotree Coding (TPLZC) that performs a raster tree search for a better reconstructed image quality. Nevertheless, the zerotree relation is no longer embedded in the raster tree search, so additional buffer memory will be required to store the matrix-wide zerotree relations. TPLZC utilizes a simple tree-pruning method and a flag bit-map to construct and store the entire zerotree structure. As a result, TPLZC exhibits not only a low coding memory requirement but also a low coding complexity.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134416954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic news video segmentation and categorization based on closed-captioned text","authors":"Weiyu Zhu, C. Toklu, S. Liou","doi":"10.1109/ICME.2001.1237850","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237850","url":null,"abstract":"In this paper, we present a novel statistical approach, called the weighted voting method, for automatic news video story categorization based on the closed captioned text. News video is initially segmented into stories using the demarcations in the closed captioned text, then a set of","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133795653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}