{"title":"A live multimedia stream querying system","authors":"B. Liu, Amarnath Gupta, R. Jain","doi":"10.1145/1160939.1160950","DOIUrl":"https://doi.org/10.1145/1160939.1160950","url":null,"abstract":"Querying live media streams captured by various sensors is becoming a challenging problem, due to data heterogeneity and the lack of a unifying data model capable of accessing various multimedia data and providing reasonable abstractions for query purposes. In this paper we propose a system that enables directly capturing media streams from sensors and automatically generating more meaningful feature streams that can be queried by a data stream processor. The system provides an effective combination of extensible digital processing techniques and general data stream management research.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124879269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-vector feature space based on pseudo-euclidean space and oblique basis for similarity searches of images","authors":"Yasuo Yamane, T. Hoshiai, H. Tsuda, Kaoru Katayama, Manabu Ohta, H. Ishikawa","doi":"10.1145/1039470.1039479","DOIUrl":"https://doi.org/10.1145/1039470.1039479","url":null,"abstract":"Investigators have tried to increase the precision of similarity searches of images by using distance functions that reflect the similarity of features. When the quadratic-form distance is used, however, dissimilar images can be judged to be similar. We therefore propose that the similarity of images be evaluated using a measure of distance in a multi-vector feature space based on pseudo-Euclidean space and an oblique basis (MVPO). In this space an image is represented by a set of vectors, each of which represents one feature, and we propose a distance (called D-distance) between two sets of vectors. Roughly speaking, it is the distance between solids. Another representative distance used in similarity searches is the Earth Mover's Distance (EMD). It can be formalized using MVPO, and that explains well why EMD outperforms quadratic-form distance. The main difference between EMD and D-distance is that EMD is based on partial matching whereas D-distance is based on total matching. We also discuss performance issues of MVPO and D-distance to address their practical use.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129507139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Managing video collections at large","authors":"Nicolas Moënne-Loccoz, Bruno Janvier, S. Marchand-Maillet, E. Bruno","doi":"10.1145/1039470.1039484","DOIUrl":"https://doi.org/10.1145/1039470.1039484","url":null,"abstract":"Video document retrieval is now an active part of the domain of multimedia retrieval. However, unlike for other media, the management of a collection of video documents adds the problem of efficiently handling an overwhelming volume of temporal data. Challenges include balancing efficient content modeling and storage against fast access at various levels. In this paper, we detail the framework we have built to accommodate our developments in content-based multimedia retrieval. We show that our framework not only facilitates the development of processing and indexing algorithms but also opens the way to several other possibilities, such as rapid interface prototyping and retrieval algorithm benchmarking. In this respect, we discuss our developments in relation to wider contexts such as MPEG-7 and the TREC Video Track.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125077388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nearest neighbor search on multimedia indexing structures","authors":"T. Seidl","doi":"10.1145/1039470.1039474","DOIUrl":"https://doi.org/10.1145/1039470.1039474","url":null,"abstract":"Multimedia databases are getting larger and larger, and this trend is expected to continue in the future. Various aspects drive the demand for efficient database techniques to manage the flood of multimedia data, namely the increasing number of objects, the increasing complexity of objects, and the emergence of new query types. Whereas traditional indexing structures cope with large numbers of simple objects, complex multimedia objects require more sophisticated indexing techniques. In the tutorial, we discuss characteristics of multimedia data and multimedia queries, including similarity range queries and k-nearest neighbor queries. The main focus is on efficient processing of k-NN queries in various settings and includes direct k-NN search on indexes, multi-step k-NN query processing for complex distance functions, and methods for high-dimensional spaces.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122249040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Future applications and solutions","authors":"M. Tamer Özsu, J. Carrive, S. Gilles, Izabela Grasland, R. Mohr, T. Seidl","doi":"10.1145/1039470.1039486","DOIUrl":"https://doi.org/10.1145/1039470.1039486","url":null,"abstract":"While the technical solutions developed by Computer Vision and Database researchers are often elegant and well designed, it is not clear that they are always able to solve the actual problems that users of image and multimedia databases are facing. Users range from professional users to leisurely users, although with the improvements in digital cameras, even leisurely users may quickly accumulate tens of thousands of images. Overall, these users are likely to vary significantly in what they are trying to achieve, what data they manipulate, how much data they deal with, which tools they use, and so on. Many works in Computer Vision and Databases, however, deal only with a single application, frequently even working with artificially generated data. On the other hand, the users may not be aware of the great technical solutions, which might well solve some of their problems, if appropriately applied. The goal of this panel is therefore to be a forum for exchanging ideas on the applications of image and video data. The panel will include professional users that deal everyday with huge volumes of data, but are using that data in very different ways. These people can clearly describe what kind of tools they would need to facilitate the management of their large volumes of multimedia data. The panel will also include Computer Vision and Database researchers that typically address technical issues such as enhancing image recognition or designing faster systems.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"198 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132683926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On scalability of active learning for formulating query concepts","authors":"Wei-Cheng Lai, Kingshy Goh, E. Chang","doi":"10.1145/1039470.1039477","DOIUrl":"https://doi.org/10.1145/1039470.1039477","url":null,"abstract":"Query-by-example and query-by-keyword both suffer from the problem of \"aliasing,\" meaning that example-images and keywords potentially have variable interpretations or multiple semantics. For discerning which semantic is appropriate for a given query, we have established that combining active learning with kernel methods is a very effective approach. In this work, we first examine active-learning strategies, and then focus on addressing the challenges of two scalability issues: scalability in dataset size and in concept complexity. We present remedies, explain limitations, and discuss future directions that research might take.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132989185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A case study on array query optimisation","authors":"R. Cornacchia, A. V. Ballegooij, A. D. Vries","doi":"10.1145/1039470.1039476","DOIUrl":"https://doi.org/10.1145/1039470.1039476","url":null,"abstract":"The development of applications involving multi-dimensional data sets on top of an RDBMS raises several difficulties that are not directly related to the scientific problem being addressed. In particular, an additional effort is needed to resolve the mismatch between the array-based data model typical of such computations and the set-based data model provided by the RDBMS. The RAM (Relational Array Mapping) system fills this gap, silently providing a mapping layer between the two data models. As expected though, a naive implementation of such an automatic translation cannot compete with the efficiency of queries written by an experienced programmer. In order to make RAM a valid alternative to expensive and time-consuming hand-written solutions, this performance gap should be reduced. We study a real-world application aimed at the ranking of multimedia collections to assess the impact of different implementation strategies. The result of this study provides an illustrative outlook for the development of generally applicable optimisation techniques.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132059393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event-based modeling and processing of digital media","authors":"Rahul Singh, Zhao Li, Pilho Kim, D. Pack, R. Jain","doi":"10.1145/1039470.1039478","DOIUrl":"https://doi.org/10.1145/1039470.1039478","url":null,"abstract":"Capture, processing, and assimilation of digital media-based information such as video, images, or audio requires a unified framework within which signal processing techniques and data modeling and retrieval approaches can act and interact. In this paper we present the rudiments of such a framework based on the notion of \"events\". This framework serves the dual roles of a conceptual data model and a prescriptive model that defines the requirements for appropriate signal processing. Among the key advantages of this framework is that it fundamentally brings together the traditionally diverse disciplines of databases and (various areas of) digital signal processing. In addition to the conceptual event-based framework, we present a physical implementation of the event model. Our implementation specifically targets the problem of processing, storage, and querying of multimedia information related to indoor group-oriented activities such as meetings. Such multimedia information may comprise video, image, audio, and text-based data. We use this application context to illustrate many of the practical challenges that are encountered in this area, our solutions to them, and the open problems that require research across databases, computer vision, audio processing, and multimedia.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114137499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multimedia data base browsing system","authors":"Massimiliano Albanese, C. Cesarano, A. Picariello","doi":"10.1145/1039470.1039481","DOIUrl":"https://doi.org/10.1145/1039470.1039481","url":null,"abstract":"Browsing large multimedia databases is becoming a challenging problem, due to the availability of great amounts of data and the complexity of retrieval. In this paper we propose a system that assists a user in browsing a digital collection by making useful recommendations. The system combines computer vision techniques and taxonomic classifications to measure the similarity between objects and adopts an innovative strategy to take user behavior into account.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131800551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image + database ≠ image database","authors":"R. Mohr","doi":"10.1145/1039470.1039472","DOIUrl":"https://doi.org/10.1145/1039470.1039472","url":null,"abstract":"For decades, computer vision researchers have been trying to extract high-level information from images. While the semantics of images is still unreachable from the signal in most real cases, users would like to express requests to image databases using high-level queries. This gap between user needs and image processing capabilities will limit the use of image databases in the near- to mid-term future. Meaningful applications, however, are already possible using existing scientific technology, for instance using query-by-example. The scalability of such applications stresses the need for: new indexing methods able to handle approximate measures from the image signal; approximate search methods that are efficient in high-dimensional spaces; and robust search methods able to handle many partially erroneous data (outliers). The tutorial will illustrate some limited answers to these open problems using invariant features, robust statistics, and probabilistic matching. It will then focus on the long-term goal of high-level semantics extraction from images. This problem is as yet poorly defined: the semantics of an image is user dependent and nobody knows how to express it in a formal way. Some limited answers exist, however, and the tutorial will illustrate how learning mechanisms provide impressive initial results. Moreover, learning can be linked to relevance feedback and therefore enables better user-dependent search.","PeriodicalId":346313,"journal":{"name":"Computer Vision meets Databases","volume":"21 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120911196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}