Proceedings. IEEE International Conference on Multimedia and Expo: Latest Articles

Automatic feedback for content based image retrieval on the Web
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035758
Y. Aslandogan, Clement T. Yu
Abstract: We address the problem of identifying images of persons in large collections, such as the Web, without an existing face image database. We describe a method and a system that automatically constructs an initial face image database for a person using textual evidence obtained from the Web, and then uses this database for identifying images of that person. The initial retrieval results are obtained via text/HTML analysis and face detection. An internal clustering process groups visually similar faces among these initial results and builds a facial database. This database is then used by a face recognizer. The outputs of the textual and visual evidence modules are combined using the Dempster-Shafer (1976) evidence combination formula. We present the results of an experimental evaluation where the system was able to improve upon the detection-only method when text/HTML analysis performed poorly.
Pages: 221-224, vol. 1
Citations: 8
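
The evidence combination step mentioned in the abstract above can be illustrated with Dempster's rule of combination. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the hypothesis labels, the mass values, and the function name `combine` are hypothetical and chosen only for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass;
    the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to incompatible hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame of discernment: the image shows the person or it does not.
PERSON = frozenset({"person"})
OTHER = frozenset({"other"})
EITHER = PERSON | OTHER  # mass left uncommitted

# Assumed example masses from a textual module and a visual module.
text_evidence = {PERSON: 0.6, EITHER: 0.4}
face_evidence = {PERSON: 0.7, OTHER: 0.1, EITHER: 0.2}

fused = combine(text_evidence, face_evidence)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

Representing each hypothesis set as a `frozenset` lets set intersection express which pairs of statements are compatible, and keeps uncommitted mass (the `EITHER` entry) explicit.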
Image retrieval based on multi-scale edge model
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035627
P. Bao, Xianjun Zhang
Abstract: We propose a novel scheme for image retrieval using a wavelet-based multi-scale edge model. All images in the database are decomposed into their multi-scale primal sketch and background images, respectively. The images are stored in the form of the extracted edge structures and background. The similarities between the query image and the images in the database are measured based on the statistics of the edge structures. The multi-scale edge modeling of an image database can also be performed in real time to enable image retrieval on arbitrary image databases. Experiments show that the proposed scheme gives promising retrieval performance compared with conventional retrieval methods.
Pages: 417-420, vol. 2
Citations: 8
Bit-plane error recovery via cross subband for image transmission in JPEG2000
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035740
Pei-Jun Lee, Liang-Gee Chen
Abstract: For multimedia transmission over noisy channels, the error robustness of JPEG2000 evidently outperforms that of JPEG. Since JPEG2000 is based on the discrete wavelet transform (DWT), traditional error concealment algorithms for still images in the discrete cosine transform (DCT) domain are not suitable for JPEG2000. In JPEG2000, decoding is processed bitplane by bitplane. Any data loss occurring in the bitstream will affect the subsequent bitplanes and their wavelet coefficients. To solve this problem, the JPEG2000 VM7.2 program replaces the missing wavelet coefficients with zeros. However, the replacement may affect many significant nonzero coefficients, so that some high-frequency components are lost. In this paper, we present a novel error concealment algorithm for image transmission on a bitplane basis. The proposed algorithm recovers the damaged bitplane data according to the cross-subband and undamaged bitplane information. The recovered wavelet coefficients are similar to the error-free data. The objective results show that the proposed algorithm achieves a 3 to 8 dB improvement over decoding without the error-resilient mechanism. Subjectively, the proposed algorithm yields much smoother edges in the reconstructed image.
Pages: 149-152, vol. 1
Citations: 17
Self-optimized spectral correlation method for background music identification
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035786
M. Abe, M. Nishiguchi
Abstract: This paper proposes a new method of detecting a known reference signal in an input signal highly corrupted by other sounds. One major application of the method is the identification of broadcast background music corrupted by speech. In this method, the reference signal is first decomposed into a number of small time-frequency components, and the maximum similarity between each component and the input is calculated. The similarities for all the components are then integrated by a voting method. Finally, the result is used to determine whether or not the reference exists in the input and, if it exists, to determine its position. Experiments on the identification of background music and the classification of similar TV commercials have shown that this method can identify 100% of target signals with an SNR of -10 dB.
Pages: 333-336, vol. 1
Citations: 8
Optimized video streaming for networks with varying delay
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035431
S. Wee, Wai-tian Tan, J. Apostolopoulos, M. Etoh
Abstract: This paper presents a method for distortion-optimized streaming of predictively coded video over packet networks with varying delay. In networks with significant delay variations, coded video frames can arrive late at the decoder and miss their respective display deadlines. Furthermore, due to predictive coding, a late frame can also prevent a number of subsequent frames from being displayed properly, where the number of affected frames or degree of distortion depends on the particular coding dependencies of the late frame. In this paper, we present an optimized video streaming strategy based on frame reordering for networks with significant delay variations. This streaming strategy minimizes distortion by exploiting the fact that different late frames result in different degrees of distortion. We model the router-induced delay in a wired network with an analytical PDF and we model the link-layer retransmission delay of a wireless network with the 3GPP specification for W-CDMA radio link control. We compute the distortion for different frame reorderings using the network delay models and a source model that accounts for the prediction dependencies of predictively coded video. Our optimized streaming strategies are shown to reduce the number of late frames by 14 to 23% for the situations examined.
Pages: 89-92, vol. 2
Citations: 37
Interactive room acoustic rendering in real time
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035827
L. Savioja, T. Lokki, J. Huopaniemi
Abstract: The goal of this paper is to give an overview of real-time room acoustic rendering. The approach is based on the source-medium-receiver model, in which we model sound sources, room acoustics, and a listener. The basic techniques for each of these are presented, but the main emphasis is on room acoustic modeling and interactive auralization. As a case study we present the structure of the DIVA auralization system developed at the Helsinki University of Technology. In addition, we describe subjective evaluations of our system. Finally, a discussion of some applications of virtual acoustics and their computational needs is given.
Pages: 497-500, vol. 1
Citations: 1
On model-based clustering of video scenes using scenelets
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035778
Hong Lu, Yap-Peng Tan
Abstract: We propose in this paper a model-based approach to clustering video scenes based on scenelets. We define a video scenelet as a short consecutive sample of frames of a video sequence. The approach makes use of an unsupervised method to represent scenelets of a video with a concise Gaussian mixture model and cluster them into different video scenes according to their visual similarities. In particular, the expectation-maximization algorithm is employed to estimate the unknown model parameters, and the Bayesian information criterion is used to determine the optimal number and model of scene clusters in a principled manner. This approach is fundamentally different from many existing video clustering methods, as it does not require explicit knowledge of shot boundaries. Instead, the shot boundaries can also be obtained as a by-product of the scene clustering process. The proposed methods have been tested with various types of sports videos and promising results are reported in this paper.
Pages: 301-304, vol. 1
Citations: 5
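
The model selection step named in the abstract above, choosing the number of clusters with the Bayesian information criterion (BIC), can be sketched as follows. This is an illustrative Python example only, assuming the common form BIC = k ln(n) - 2 ln(L), where lower is better. The log-likelihoods, parameter counts, and sample size below are made-up placeholder values, not results from the paper, and the function names `bic` and `select_model` are hypothetical.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Bayesian information criterion in the form k*ln(n) - 2*ln(L)."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def select_model(candidates, num_samples):
    """Return the key of the candidate with the lowest BIC.

    `candidates` maps a cluster count to a pair
    (maximized log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda k: bic(*candidates[k], num_samples))


# Placeholder fits for 1, 2, and 3 mixture components, as they might be
# reported by an EM run (values invented for illustration).
fits = {
    1: (-1500.0, 5),
    2: (-1300.0, 11),
    3: (-1295.0, 17),
}
best_k = select_model(fits, num_samples=500)
print("selected number of clusters:", best_k)
```

In this toy input, the third component improves the likelihood only slightly, so the penalty on its extra parameters outweighs the gain and the two-component model is selected.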
Universal MPEG content access using compressed-domain system stream editing techniques
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035419
Ching-Yung Lin, Belle L. Tseng, John R. Smith
Abstract: An MPEG system layer compressed-domain editing technique is proposed to facilitate the delivery and integration of multiple segments of MPEG files residing on remote databases. Various multimedia applications, including retrieval and summarization, split MPEG files into small segments along shot boundaries and store them separately. This traditional method requires extra management and storage payload, provides only fixed segmentations, and may not play smoothly. In order to solve this problem, our MPEG system-domain editing tool directly extracts video-audio information from the original MPEG sources and combines them to generate a single MPEG file. Manipulated wholly in the system bitstream domain, this method does not require decoding, re-encoding, and re-synchronization of audio and video data. Thus, it operates in real time and provides great flexibility. This composite MPEG file can be transmitted and displayed through general Web interfaces. The proposed method is applied to our video retrieval, video summarization, and video editing systems, and has shown its great advantages.
Pages: 73-76, vol. 2
Citations: 13
A cost-effective solution for eye-gaze assistive technology
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035632
Fulvio Corno, L. Farinetti, I. Signorile
Abstract: The problem of assisting people with special needs is assuming a central role in our society, and information and communication technologies are asked to play a key role in aiding people with both physical and cognitive disabilities. This paper describes an eye tracking system, whose strong points are its simplicity and consequent affordability, designed and implemented to allow people with severe motor disabilities to use gaze as an input device for selecting areas on a computer screen. The paper reports the motivation for this kind of input device, together with the communication impairments that it may help to solve. It then describes the adopted technical solution, compares it to existing approaches, and reports the results obtained in experiments.
Pages: 433-436, vol. 2
Citations: 53
Retrieval of articulate objects from images and video using invariant signatures
Proceedings. IEEE International Conference on Multimedia and Expo Pub Date : 2002-11-07 DOI: 10.1109/ICME.2002.1035757
Ronald-Bryan O. Alferez, Yuan-fang Wang
Abstract: We propose a new method of retrieving multi-part, articulate objects from images and video. The scheme is particularly well suited for analyzing images and video for objects that can pose differently with possible shape deformation and articulated motion. The scheme involves computing an invariant signature for each segmented region in the image, in a manner that is insensitive to translation, rotation, scale, and shear. Using circular cross-correlation, these signatures can then be efficiently compared with those of user-defined regions of interest. Ambiguities between individual region matches are then resolved through relaxation labeling techniques. A final match is established when a collection of segmented regions conforms to the query object, both in terms of local shape description and global structural relation. The scheme thus allows for articulated movement of object parts within the scene. The procedure is easy to implement, yet shows promising results in its ability to isolate interesting regions in images and video, to account for structural and relational constraints among regions, and to integrate both local shape and global structural information for a detailed examination of the scene in a way that is invariant to many visual variations.
Pages: 217-220, vol. 1
Citations: 0
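
The signature comparison by circular cross-correlation mentioned in the abstract above can be illustrated with a short Python sketch. It assumes, purely for illustration, that a signature is a fixed-length list of samples around a closed contour, so that a different starting point appears as a cyclic shift. The signature values and the function names are hypothetical, and the direct O(n^2) sum is used for clarity rather than an FFT-based computation.

```python
def circular_cross_correlation(a, b):
    """Correlation of a with every cyclic shift of b (same length)."""
    n = len(a)
    if len(b) != n:
        raise ValueError("signatures must have the same length")
    return [
        sum(a[i] * b[(i + shift) % n] for i in range(n))
        for shift in range(n)
    ]


def best_shift(a, b):
    """Return (shift, score) for the cyclic shift of b that best matches a."""
    scores = circular_cross_correlation(a, b)
    shift = max(range(len(scores)), key=scores.__getitem__)
    return shift, scores[shift]


# Hypothetical query signature and the same signature sampled from a
# different starting point (a cyclic rotation by two positions).
signature = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
rotated = signature[2:] + signature[:2]

shift, score = best_shift(signature, rotated)
print("best shift:", shift, "score:", score)
```

Because every cyclic shift is scored, the comparison does not depend on where the sampling of each signature starts, which is the property that makes the circular form useful for matching.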