MULTIMEDIA '99: latest papers

Text enhancement in digital video using multiple frame integration
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319466
Huiping Li, D. Doermann
Abstract: In this paper, a multiple-frame-based technique to enhance text in digital video is presented. After extracting a reference text block, we use an image matching technique to find the corresponding text blocks in consecutive frames. We register these text blocks to subpixel levels by using image interpolation techniques to improve both correspondence and text resolution. The registered text blocks are averaged to obtain a new text block with a clean background and a higher resolution. Experiments conducted on several video sequences show that our enhancement scheme can improve the accuracy of commercial off-the-shelf OCR considerably.
Citations: 108
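The averaging step described in the abstract above can be illustrated with a minimal sketch. It assumes the text blocks have already been matched and registered to a common grid (the matching and subpixel interpolation steps are not shown), and the `average_blocks` helper and the sample pixel values are hypothetical, not taken from the paper.

```python
from typing import List

Frame = List[List[float]]  # grayscale block as rows of pixel intensities


def average_blocks(blocks: List[Frame]) -> Frame:
    """Pixel-wise mean of text blocks that are already registered to each other."""
    if not blocks:
        raise ValueError("need at least one block")
    height, width = len(blocks[0]), len(blocks[0][0])
    for block in blocks:
        if len(block) != height or any(len(row) != width for row in block):
            raise ValueError("all blocks must have the same shape")
    count = len(blocks)
    return [
        [sum(block[y][x] for block in blocks) / count for x in range(width)]
        for y in range(height)
    ]


# Three noisy observations of the same 2x2 block (hypothetical values).
frames = [
    [[10.0, 200.0], [12.0, 198.0]],
    [[14.0, 204.0], [8.0, 202.0]],
    [[12.0, 196.0], [10.0, 200.0]],
]
averaged = average_blocks(frames)
```

Averaging suppresses background variation that changes from frame to frame while the static text stays put, which is the intuition behind the cleaner background the abstract reports.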
Automatic construction of personalized TV news programs
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319637
B. Mérialdo, Kyung-Tak Lee, D. Luparello, Jeremie Roudaire
Abstract: In this paper, we study the automatic construction of personalized TV News programs, where we want to build a program with predefined duration and maximum content value for a specific user. We combine video indexing techniques to parse TV News recordings into stories, and information filtering techniques to select the stories that best fit the user profile. We formalize the selection process as an optimization problem, and we study how to take into account duration in the selection of stories. Experiments show that a simple heuristic can provide high quality selection with little computation. We also describe two prototypes, which implement two different mechanisms for the construction of user profiles: explicit specification, using a category-based model, and implicit specification, using a keyword-based model.
Citations: 104
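The selection problem in the abstract above (maximum content value within a fixed duration) has the shape of a knapsack problem. The abstract does not say which heuristic the authors used, so the sketch below is only one plausible choice: a greedy pick by value per second. The `Story` type, the scores, and the story titles are hypothetical.

```python
from typing import List, NamedTuple


class Story(NamedTuple):
    title: str
    duration: float  # seconds, assumed to be positive
    value: float     # content value for this user (hypothetical score)


def select_stories(stories: List[Story], max_duration: float) -> List[Story]:
    """Greedy pick by value per second until the duration budget is used up."""
    ranked = sorted(stories, key=lambda s: s.value / s.duration, reverse=True)
    chosen: List[Story] = []
    used = 0.0
    for story in ranked:
        if used + story.duration <= max_duration:
            chosen.append(story)
            used += story.duration
    return chosen


stories = [
    Story("election", 120.0, 9.0),
    Story("weather", 30.0, 3.0),
    Story("sports", 90.0, 4.0),
    Story("markets", 60.0, 5.0),
]
program = select_stories(stories, max_duration=210.0)
```

A greedy ranking like this costs one sort and one pass, but it is not guaranteed to find the optimal program; an exact knapsack solver would be the alternative when the story list is small.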
Dynamic frame rate control for video streams
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319481
S. Pejhan, Tihao Chiang, Ya-Qin Zhang
Abstract: A mechanism for dynamically varying the frame rate of pre-encoded video clips is described. An off-line encoder creates a high quality bitstream encoded at 30 fps, as well as separate files containing motion vectors for the same clip at lower frame rates. An on-line encoder decodes the bitstream (if necessary) and re-encodes it at lower frame rates in real time using the pre-computed, stored motion information. Dynamic Frame Rate Control, used in conjunction with dynamic bit-rate control, allows clients to solve the rate mismatch between the bandwidth available to them and the bit-rate of the pre-encoded bitstream. It also provides a means for implementing Fast Forward control for video streaming without increasing bandwidth consumption.
Citations: 16
Query refinement for multimedia similarity retrieval in MARS
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319613
Kriengkrai Porkaew, K. Chakrabarti
Abstract: During the past few years, content-based multimedia retrieval has become one of the most active areas of research. Unlike traditional database queries, content-based multimedia retrieval queries are imprecise in nature, which makes it difficult for users to express their exact information need in the form of a precise query right away. A typical interface allows the user to express her information need by selecting examples of objects similar to the ones she wishes to retrieve. Such a user interface requires mechanisms to learn the query representation from the examples. In this paper, we present the query refinement approach used in the Multimedia Analysis and Retrieval System (MARS) for learning query representations through relevance feedback. The proposed technique uses query expansion to modify the query representation. In query expansion, in each iteration of feedback, the relevant objects are added to the query and non-relevant ones are removed. We compare it with approaches based on query point movement proposed in our previous work. We propose efficient query evaluation techniques for processing similarity queries and refined queries in MARS. Our experiments show that query expansion significantly outperforms the query point movement approach in terms of both retrieval effectiveness and execution cost.
Citations: 194
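The feedback loop described in the abstract above (relevant objects are added to the query, non-relevant ones are removed) can be sketched as set operations over object ids. This is an illustration only: the `refine_query` and `multipoint_distance` helpers are hypothetical, and scoring a candidate by its mean Euclidean distance to the query points is an assumption, not the distance function used in MARS.

```python
import math
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def refine_query(query: Set[int], relevant: Set[int], non_relevant: Set[int]) -> Set[int]:
    """One feedback iteration: add relevant object ids, drop non-relevant ones."""
    return (query | relevant) - non_relevant


def multipoint_distance(candidate: Vector, query_points: List[Vector]) -> float:
    """Mean Euclidean distance from a candidate to all query points (an assumption)."""
    if not query_points:
        raise ValueError("query must contain at least one point")
    return sum(math.dist(candidate, q) for q in query_points) / len(query_points)


# Hypothetical feature vectors keyed by object id.
features: Dict[int, Vector] = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (6.0, 8.0)}

query = {1}
query = refine_query(query, relevant={2}, non_relevant=set())  # first iteration
query = refine_query(query, relevant={3}, non_relevant={1})    # second iteration
score = multipoint_distance((3.0, 4.0), [features[i] for i in sorted(query)])
```

Unlike query point movement, which keeps a single representative point and shifts it, this style of refinement lets the query grow into several points, so the cost of each distance evaluation scales with the query size.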
Modeling focus of attention for meeting indexing
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319464
R. Stiefelhagen, Jie Yang, A. Waibel
Abstract: Visual cues, such as gesturing, looking at each other or monitoring each other's facial expressions, play an important role in meetings. Such information can be used for indexing of multimedia meeting recordings. In this paper, we present an approach to detect who is looking at whom during a meeting. Our proposal is to employ Hidden Markov Models to characterize participants' focus of attention by using gaze information as well as knowledge about the number and positions of people present in a meeting. The number and positions of the participants' faces are detected in the field of view of a panoramic camera. We use neural networks to estimate the directions of participants' gaze from camera images. We discuss the implementation of the approach in detail including system architecture, data collection, and evaluation. The system has achieved an accuracy rate of up to 93% in detecting focus of attention on test sequences taken from meetings. We have used focus of attention as an index in a multimedia meeting browser.
Citations: 60
Robust compression and transmission of MPEG-4 video
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319478
S. Gringeri, R. Egorov, K. Shuaib, A. Lewis, B. Basch
Abstract: This paper discusses issues related to the delivery of MPEG-4 video over the Internet and wireless channels. MPEG-4's built-in error resilience capabilities such as flexible re-synchronization markers, data partitioning, header protection, reversible VLCs, and forced intra-frame refresh are described. Methods for using these techniques to build a "smart" network decoder are discussed, and the decoder's video quality is measured for various channel error conditions. The effectiveness and overheads of the various error resilience techniques are compared using both peak signal-to-noise ratio measurements and expert viewing. The use of forward error correcting strategies and the effects of packet sizes and boundaries on video quality are also examined.
Citations: 66
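The abstract above compares techniques using peak signal-to-noise ratio. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE), over flat lists of 8-bit samples. The `psnr` function name and the sample values are hypothetical; how the paper applies the measure to frames or sequences is not stated in the abstract.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if not reference or len(reference) != len(decoded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no error to measure
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical luma samples before and after a lossy channel.
original = [100, 150, 200, 250]
received = [101, 149, 200, 250]
quality_db = psnr(original, received)
```

Higher values mean the decoded samples are closer to the reference, which is why the metric is usually paired with viewing tests, as the abstract notes.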
Visualizing music and audio using self-similarity
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319472
J. Foote
Abstract: This paper presents a novel approach to visualizing the time structure of music and audio. The acoustic similarity between any two instants of an audio recording is displayed in a 2D representation, allowing identification of structural and rhythmic characteristics. Examples are presented for classical and popular music. Applications include content-based analysis and segmentation, as well as tempo and structure extraction.
Citations: 420
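The 2D representation described in the abstract above can be sketched as a matrix whose entry (i, j) holds the similarity between instants i and j. The sketch assumes each instant is already summarized by a feature vector and uses cosine similarity as the measure; both the feature extraction and the choice of cosine are assumptions here, and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def self_similarity(features: List[Sequence[float]]) -> List[List[float]]:
    """Square matrix comparing every instant of a recording with every other."""
    return [[cosine(a, b) for b in features] for a in features]


# A toy recording that alternates between two sounds (hypothetical features).
frames = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
matrix = self_similarity(frames)
```

Rendered as an image, a matrix like this shows a bright main diagonal, and repetition in the audio appears as bright stripes parallel to it, which is what makes rhythm and structure visible.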
Geometrically correct imagery for teleconferencing
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319596
Ruigang Yang, M. S. Brown, W. Seales, H. Fuchs
Abstract: Current camera-monitor teleconferencing applications produce unrealistic imagery and break any sense of presence for the participants. Other capture/display technologies can be used to provide more compelling teleconferencing. However, complex geometries in capture/display systems make producing geometrically correct imagery difficult. It is usually impractical to detect, model and compensate for all effects introduced by the capture/display system. Most applications simply ignore these issues and rely on user acceptance of the camera-monitor paradigm. This paper presents a new and simple technique for producing geometrically correct imagery for teleconferencing environments. The necessary image transformations are derived by finding a mapping between a capture and display device for a fixed viewer location. The capture/display relationship is computed directly in device coordinates and completely avoids the need for any intermediate, complex representations of screen geometry, capture and display distortions, and viewer location. We describe our approach and demonstrate it via several prototype implementations that operate in real time and provide a substantially more compelling sense of presence than the standard teleconferencing paradigm.
Citations: 16
An RTP-based synchronized hypermedia live lecture system for distance education
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319475
Herng-Yow Chen, Y. Chia, Gin-Yi Chen, Jen-Shin Hong
Abstract: In this article, we introduce a "Live Synchronized Hypermedia Live Lecture (SHLL) System" that uses RTP to synchronize the live presentation of a streaming video lecture, HTML-based lecture notes, and HTML page navigation events. The SHLL framework consists of three major modules: (1) the SHLL Recorder, for recording the temporal information of the AV lecture and the HTML-based lecture notes navigation processes; (2) the SHLL Event Server, for receiving, depositing, and multicasting SHLL events; (3) the SHLL Browser, for presentation of the synchronized AV lecture and HTML-based lecture notes navigation. To manage the synchronized presentation of different media, we propose an RTP-based Multi-Sync synchronization model, which accounts for human perception factors. To evaluate the performance of the proposed SHLL framework and synchronization model, a RealSystem-based prototype Synchronized HTML-AV Distance Lecture System has been implemented using Java/JavaScript and C. The prototype system demonstrates the feasibility of the proposed framework for synchronized hypermedia live multicasting.
Citations: 8
Multimedia access and retrieval (panel session): the state of the art and future directions
MULTIMEDIA '99 Pub Date : 1999-10-30 DOI: 10.1145/319463.319684
Shih-Fu Chang, Gwendal Auffret, J. Foote, Chung-Sheng Li, B. Shahraray, T. Syeda-Mahmood, HongJiang Zhang
Abstract: Several years have passed since the research topic of content-based multimedia retrieval emerged. We have witnessed research activity burgeoning into a plenitude of new indexing, retrieval, and filtering tools for images, video, audio, music, graphics, and their combinations with text-based information. Exciting research opportunities arise when integrating knowledge from multiple disciplines, such as media content processing, database, information retrieval, and machine user interface. In the commercial domain, we have also witnessed several impressive efforts moving technologies into practical arenas.
Citations: 22