{"title":"The space between the images","authors":"L. Guibas","doi":"10.1145/2502081.2509857","DOIUrl":null,"url":null,"abstract":"Multimedia content has become a ubiquitous presence on all our computing devices, spanning the gamut from live content captured by device sensors such as smartphone cameras to immense databases of images, audio and video stored in the cloud. As we try to maximize the utility and value of all these petabytes of content, we often do so by analyzing each piece of data individually and foregoing a deeper analysis of the relationships between the media. Yet with more and more data, there will be more and more connections and correlations, because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy. This is particularly true for media of a geometric character, such as GPS traces, images, videos, 3D scans, 3D models, etc. In this talk we focus on the \"space between the images\", that is on expressing the relationships between different mutlimedia data items. We aim to make such relationships explicit, tangible, first-class objects that themselves can be analyzed, stored, and queried -- irrespective of the media they originate from. We discuss mathematical and algorithmic issues on how to represent and compute relationships or mappings between media data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps and relationships, small and large, between inter-related data. The network can act as a regularizer, allowing us to to benefit from the \"wisdom of the collection\" in performing operations on individual data sets or in map inference between them. We will illustrate these ideas using examples from the realm of 2D images and 3D scans/shapes -- but these notions are more generally applicable to the analysis of videos, graphs, acoustic data, biological data such as microarrays, homeworks in MOOCs, etc. This is an overview of joint work with multiple collaborators, as will be discussed in the talk.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2502081.2509857","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Multimedia content has become a ubiquitous presence on all our computing devices, spanning the gamut from live content captured by device sensors such as smartphone cameras to immense databases of images, audio and video stored in the cloud. As we try to maximize the utility and value of all these petabytes of content, we often do so by analyzing each piece of data individually and foregoing a deeper analysis of the relationships between the media. Yet with more and more data, there will be more and more connections and correlations, because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy. This is particularly true for media of a geometric character, such as GPS traces, images, videos, 3D scans, 3D models, etc. In this talk we focus on the "space between the images", that is on expressing the relationships between different mutlimedia data items. We aim to make such relationships explicit, tangible, first-class objects that themselves can be analyzed, stored, and queried -- irrespective of the media they originate from. We discuss mathematical and algorithmic issues on how to represent and compute relationships or mappings between media data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps and relationships, small and large, between inter-related data. The network can act as a regularizer, allowing us to to benefit from the "wisdom of the collection" in performing operations on individual data sets or in map inference between them. We will illustrate these ideas using examples from the realm of 2D images and 3D scans/shapes -- but these notions are more generally applicable to the analysis of videos, graphs, acoustic data, biological data such as microarrays, homeworks in MOOCs, etc. This is an overview of joint work with multiple collaborators, as will be discussed in the talk.