{"title":"A System for Automatic Judgment of Offsides in Soccer Games","authors":"Sadatsugu Hashimoto, S. Ozawa","doi":"10.1109/ICME.2006.262924","DOIUrl":"https://doi.org/10.1109/ICME.2006.262924","url":null,"abstract":"In this paper, we propose a system for automatic judgment of offsides in soccer games. We detect and track players in fixed multi camera images and calculate the world coordinates of them. Furthermore, we do a formation analysis by classifying uniforms and calculate the position of an offside line. On the other hand, we calculate the 3D coordinates and the trajectories of a ball in world coordinates from the plane coordinates of a ball in multi cameras and recognize the moment of a play from the 3D trajectories of a ball. In addition, we make a judge player's interfering with play by analyzing the spatial relationship between a ball and players. Finally, we make an offside judgment by integrating these results. We apply our system to a real soccer match and demonstrate the availability of this system by showing the experimental results","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115865735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PING: a Group-to-Individual Distributed Meeting System","authors":"Y. Rui, Eric Rudolph, Li-wei He, Rico Malvar, Michael F. Cohen, I. Tashev","doi":"10.1109/ICME.2006.262737","DOIUrl":"https://doi.org/10.1109/ICME.2006.262737","url":null,"abstract":"Group-to-individual (G2I) distributed meeting is an important but understudied area. Because of the asymmetry between different parties in G2I meetings, it has two unique challenges: 1) the remote participant tends to be ignored by the local participants; and 2) the remote participant has inferior audio, video, and data experience than the local participants. To address these issues, in this paper we present PING, a system explicitly designed for G2I distributed meetings that combines recent advances in both hardware, e.g., microphone arrays, remote person stand-in devices, and software, e.g., audio-video processing, to improve users' G2I meeting experience. We report how PING addresses the above two challenges and its system design and implementation","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132551391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Implicit Relevane Feedback to Advance Web Image Search","authors":"En Cheng, Feng Jing, Mingjing Li, Wei-Ying Ma, Hai Jin","doi":"10.1109/ICME.2006.262895","DOIUrl":"https://doi.org/10.1109/ICME.2006.262895","url":null,"abstract":"Although relevance feedback has been extensively studied in content-based image retrieval in the academic area, no commercial Web image search engine has employed the idea. There are several obstacles for Web image search engines in applying relevance feedback. To overcome these obstacles, we proposed an efficient implicit relevance feedback mechanism. The proposed mechanism shows advantage over traditional relevance feedback methods in the following three aspects. Firstly, instead of enforcing the users to make explicit judgment on the results, our method regards user's click-through data as implicit relevance feedback which release burden from users. Secondly, a hierarchical image search results clustering algorithm is proposed to semantically organize the search results. Using the clustering results as features, our relevance feedback scheme could catch and reflect users' search intention precisely. Lastly, unlike traditional relevance feedback user interface which hardily substitutes subsequent results for previous ones, our method employed friendly recommendation rather than substitution to let the user narrow down on the refined images. To evaluate the implicit relevance feedback mechanism, comprehensive user studies were performed","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129985665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DMB (Digital Multimedia Broadcasting) Voice EPG Application","authors":"Bong-Ho Lee, S. Park, Heejeong Kim, C. Ahn, S. Lee","doi":"10.1109/ICME.2006.262781","DOIUrl":"https://doi.org/10.1109/ICME.2006.262781","url":null,"abstract":"Recently, mobile TV is becoming a mainstream service in mobile broadcasting scenario, where requires lots of mobile factors such as robust transmission, high performance and easy interface and so on. As mobile broadcasting services are being highlighted, the easy interface and appropriate application are increasingly in demand. In most mobile scenarios, where users may actually share their attention with other concurrent tasks and where highly integrated devices may have very limited physical characteristics, intuitive man-machine interfaces are key factors to successful applications. The EPG, an essential application in most digital broadcasting systems, is also not free from the easy interface and mobile factors. The conventional GUI driven EPG solutions are, sometimes, not appropriate to the mobile system aiming the mobile TV and rich interactive data services. In this paper we present voice enabled EPG application that features voice user interaction and dialog technology allowing the user to have a speech interaction with the terminal in navigating and searching any program or service. We illustrate an overall service framework addressing the content delivery and consuming architecture fitted to the DMB environment. Moreover, we propose and implement an agent platform by profiling the elements of VoiceXML and extending EPG related elements to enable the EPG functionalities","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130175061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Path-Diversity Overlay Retransmission Architecture for Reliable Multicast","authors":"W. Zeng, Yingnan Zhu, Haibin Lu, Hongbing Jiang","doi":"10.1109/ICME.2006.262872","DOIUrl":"https://doi.org/10.1109/ICME.2006.262872","url":null,"abstract":"IP-multicast is a bandwidth efficient transmission mechanism for group communications. Reliability in IP-multicast, however, poses a set of significant challenges. To address the reliability and scalability issues in IP-multicast, this paper proposes a novel overlay retransmission architecture that exploits path-diversity by taking advantages of both IP multicast and an overlay network. We show that the proposed path diversity overlay retransmission architecture has the potential to significantly improve the reliability, delay, playback quality, and scalability of IP-multicast based multimedia applications. The general concept of using P2P overlay networks to help improve the QoS performance of multimedia applications as illustrated in this paper is expected to have significant impact on the deployment of next generation multimedia services","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130220495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prefilter Control Scheme for Low bitrate TV Distribution","authors":"Ryoichi Kawada, A. Koike, Y. Nakajima","doi":"10.1109/ICME.2006.262952","DOIUrl":"https://doi.org/10.1109/ICME.2006.262952","url":null,"abstract":"In IP-based TV distribution, coding degradation is sometimes evident in critical scenes because the bit rate for compression is rather low. Prefiltering is an effective countermeasure since it replaces the coding noise with the degradation more difficult to detect visually, though it has the drawback that excessive smoothing might occur. This paper proposes a scene-adaptive method to control a prefilter separate from the encoder. By calculating block-wise motion-compensated predictive error variances and correlation coefficients, it estimates the coding noise as well as the potential improvement by prefiltering each frame, realizing a control scheme which performs prefiltering only when effective","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"221 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134268481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Rank based Metric of Anchor Models for Speaker Verification","authors":"Yingchun Yang, Min Yang, Zhaohui Wu","doi":"10.1109/ICME.2006.262726","DOIUrl":"https://doi.org/10.1109/ICME.2006.262726","url":null,"abstract":"In this paper, we present an improved method of anchor models for speaker verification. Anchor model is the method that represent a speaker by his relativity of a set of other speakers, called anchor speakers. It was firstly introduced for speaker indexing in large audio database. We suggest a rank based metric for the measurement of speaker character vectors in anchor model. Different from conventional metric methods which consider each anchor speaker equally and compare the log likelihood scores directly, in our method the relative order of anchor speakers is exploited to characterize target speaker. We have taken experiments on the YOHO database. The results show that EER of our method is 13.29% lower than that of conventional metric. Also, our method is more robust against the mismatching between test set and anchor set","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134354877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognizing Commercials in Real-Time using Three Visual Descriptors and a Decision-Tree","authors":"R. Glasberg, Cengiz Tas, T. Sikora","doi":"10.1109/ICME.2006.262822","DOIUrl":"https://doi.org/10.1109/ICME.2006.262822","url":null,"abstract":"We present a new approach for classifying mpeg-2 video sequences as `commercial' or `non-commercial' by analyzing specific color, texture and motion features of consecutive frames in real-time. This is part of the well-known video-genre-classification problem, where popular TV-broadcast genres like cartoon, commercial, music, news and sports are studied. Such applications have also been discussed in the context of MPEG-7. In our method the extracted features from three visual descriptors are logically combined using a decision tree to produce a reliable recognition. The results demonstrate a high identification rate based on a large collection of 200 representative video sequences (40 `commercials' and 4*40 `non-commercials') gathered from free digital TV-broadcasting in Germany","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"3 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131641175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methods for None Intrusive Delay Measurment for Audio Communication over Packet Networks","authors":"M. Zad-issa, Norbert Rossello, L. Pilati","doi":"10.1109/ICME.2006.262597","DOIUrl":"https://doi.org/10.1109/ICME.2006.262597","url":null,"abstract":"Measurement of the delay is an important and common problem in communication over packet networks. The end-to-end and the round trip delay are among the factors directly impacting the quality of service as well as the user satisfaction. Multimedia gateways or base stations that perform echo cancellation or suppression often rely on the round trip delay to enhance their performance or to reduce the computational complexity of echo processing logics. In this work, we present two none intrusive methods for delay estimation and tracking. Both methods find the delay using the actual audio signal that is sent through the network. The first approach uses the MDCT transformed domain coefficients of the signal while the second operates in a perceptual domain. Experiments illustrate that both schemes can track the end-to-end and the round trip delay under various network and signal conditions","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"2 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131726466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Labeling of Multimedia Content Clusters","authors":"Jelena Tešić, John R. Smith","doi":"10.1109/ICME.2006.262825","DOIUrl":"https://doi.org/10.1109/ICME.2006.262825","url":null,"abstract":"In this paper we present a novel approach for labeling clusters of multimedia content that leverages supervised classification techniques in conjunction with unsupervised clustering. Recent research has produced significant results for automatic tagging of video content such as broadcast news. For example, powerful techniques have been demonstrated in the context of the NIST TRECVID video retrieval benchmark. However, the information needs of users typically span a range of semantic concepts. One of the challenges of these multimedia retrieval systems is to organize the video data in such a way that allows the user to most efficiently navigate the semantic space for the video data set. One important tool for video data organization is clustering. However, clustering results cannot be leveraged effectively when they are not labeled. We propose to build on clustering by aggregating the automatically tagged semantics. We propose and compare four techniques for labeling the clusters and evaluate the performance compared to human labeled ground-truth. We present examples of the cluster labeling results obtained on the BBC stock shots from the TRECVID-2005 video data set","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125217431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}