WISMM '14: Latest Papers

Pushing Image Recognition in the Real World: Towards Recognizing Millions of Entities
WISMM '14 Pub Date : 2014-11-07 DOI: 10.1145/2661714.2661716
Xiansheng Hua
{"title":"Pushing Image Recognition in the Real World: Towards Recognizing Millions of Entities","authors":"Xiansheng Hua","doi":"10.1145/2661714.2661716","DOIUrl":"https://doi.org/10.1145/2661714.2661716","url":null,"abstract":"Building a system that can recognize \"what,\" \"who,\" and \"where\" from arbitrary images has motivated researchers in computer vision, multimedia, and machine learning for decades. Significant progress has been made in recent years based on distributed computation and/or deep neural network techniques. However, it is still very challenging to realize a general-purpose, real-world image recognition engine that has reasonable recognition accuracy, semantic coverage, and recognition speed.\nIn this talk, we will first review the current status of this area, analyze the difficulties, and discuss potential solutions. Then two promising schemes to attack this challenge will be introduced: (1) learning millions of concepts from search engine click logs, and (2) recognizing whatever you want without data labeling. The first work tries to build large-scale recognition models by mining search engine click logs. Challenges in training data selection and model selection will be discussed, and efficient and scalable approaches for model training and prediction will be introduced. The second work aims at building image recognition engines for any set of entities without using any human-labeled training data, which helps generalize image recognition to a wide range of semantic concepts. Automatic training data generation steps will be presented, and techniques for improving recognition accuracy that effectively leverage massive amounts of Internet data will be discussed. Different parallelization strategies for different computation tasks will be introduced, which guarantee the efficiency and scalability of the entire system. Finally, we will discuss possible directions in pushing image recognition in the real world.","PeriodicalId":365687,"journal":{"name":"WISMM '14","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116328127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Storytelling with Big Multimedia Data: Keynote Talk
WISMM '14 Pub Date : 2014-11-07 DOI: 10.1145/2661714.2661715
R. Jain
{"title":"Storytelling with Big Multimedia Data: Keynote Talk","authors":"R. Jain","doi":"10.1145/2661714.2661715","DOIUrl":"https://doi.org/10.1145/2661714.2661715","url":null,"abstract":"Big data is increasingly becoming multimedia data. Storytelling is one of the oldest and most popular human activities. Since the early days of human existence, storytelling has been used as a means of simple communication as well as a medium of entertainment, education, cultural preservation, and instilling moral values through examples. A story is a presentation of experiences related to events. Events and their experiences are selected to communicate the intent of a story compellingly. The art of storytelling has always had a close relationship with the technology of the time. A good story considers the message and the audience and then selects appropriate events and related experiential media and information to weave a compelling and engaging account of the events.\nStorytelling and technology form a virtuous cycle that is intertwined and synergistic. Historically, both have evolved together and are likely to continue evolving together in the near future. Most events of interest occur in the physical world and must be captured using different sensors. Usually a single sensor is inadequate to capture the diverse aspects of an event; hence multiple sensors or media are used both to capture an event and to present event experiences so that the event can be re-experienced. Now we have diverse sensors to capture an event in all its details and can use whatever will be compelling in storytelling.\nA good story is the result of many activities: collection of data, analysis of data, selection of events and experiences that are relevant to the message, and a compelling presentation using this material. All of these activities are active research areas in multimedia big data. We discuss different forms of storytelling as they evolved and the role of technology in different stages of storytelling. We believe that we now have powerful tools and technologies to make the art of storytelling really effective. In this presentation we will show challenges for multimedia researchers that could make storytelling very effective and very compelling.","PeriodicalId":365687,"journal":{"name":"WISMM '14","volume":"302 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122244196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Large-Scale Aerial Image Categorization by Multi-Task Discriminative Topologies Discovery
WISMM '14 Pub Date : 2014-11-07 DOI: 10.1145/2661714.2661718
Yingjie Xia, Luming Zhang, Suhua Tang
{"title":"Large-Scale Aerial Image Categorization by Multi-Task Discriminative Topologies Discovery","authors":"Yingjie Xia, Luming Zhang, Suhua Tang","doi":"10.1145/2661714.2661718","DOIUrl":"https://doi.org/10.1145/2661714.2661718","url":null,"abstract":"Quickly and accurately categorizing the millions of aerial images on Google Maps is a useful technique in multimedia applications. Existing methods cannot handle this task effectively for two reasons. 1) It is challenging to build a real-time image categorization system, as some geo-aware apps update over 20 aerial images per second. 2) The topologies of aerial images are the key to distinguishing their categories, but they cannot be encoded by generic visual descriptors. To solve these two problems, we propose an efficient aerial image categorization system that mines discriminative topologies of aerial images under a multi-task learning framework. In particular, we first construct a region adjacency graph (RAG) that describes the topology of each aerial image. Aerial image categorization can thereby be formulated as RAG-to-RAG matching. Based on graph theory, RAG-to-RAG matching is conducted by comparing all their respective graphlets (i.e., small subgraphs). Because the number of graphlets is huge, a multi-task feature selection algorithm is derived to discover topologies jointly discriminative to multiple categories. The discovered topologies are used to extract the discriminative graphlets. Finally, these graphlets are integrated into an AdaBoost model for predicting aerial image categories. Experiments show that our approach is competitive with several existing recognition models. Further, over 24 aerial images are categorized per second, reflecting that our system is ready for real-world applications.","PeriodicalId":365687,"journal":{"name":"WISMM '14","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114527455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
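The abstract above describes building a region adjacency graph (RAG) for each image and comparing images through their graphlets (small connected subgraphs). As a rough illustration of those two ideas only, the sketch below builds a RAG from a toy grid of region labels and enumerates its graphlets by brute force. The grid, the function names (`build_rag`, `is_connected`, `graphlets`), and the use of 4-neighbor adjacency are all assumptions made for this example; the abstract does not say how the authors segment images or enumerate graphlets, and the feature selection and AdaBoost stages are not shown.

```python
from itertools import combinations


def build_rag(labels):
    """Return {region: set of adjacent regions} for a 2D grid of region labels.

    Assumption: two regions are adjacent if any of their pixels are 4-neighbors.
    """
    adj = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            adj.setdefault(a, set())
            # Look right and down only, so each neighboring pixel pair is seen once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        adj[a].add(b)
                        adj.setdefault(b, set()).add(a)
    return adj


def is_connected(nodes, adj):
    """Check whether the subgraph induced by `nodes` is connected (DFS)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        n = stack.pop()
        for m in adj[n] & nodes:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen == nodes


def graphlets(adj, max_size):
    """Enumerate connected node subsets of up to max_size nodes.

    Brute force over all subsets: fine for a toy graph, exponential in general.
    """
    found = []
    for k in range(1, max_size + 1):
        for combo in combinations(sorted(adj), k):
            if is_connected(combo, adj):
                found.append(combo)
    return found


# Hypothetical 3x3 segmentation with four regions labeled 0..3.
GRID = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 3],
]

rag = build_rag(GRID)
print(rag)
print(graphlets(rag, max_size=3))
```

In this toy grid the four regions form a cycle, so regions 0 and 3 touch only diagonally and do not form a two-node graphlet. Even on small graphs the number of graphlets grows quickly with `max_size`, which is consistent with the abstract's remark that a selection step is needed.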
Social Popularity Score: Predicting Numbers of Views, Comments, and Favorites of Social Photos Using Only Annotations
WISMM '14 Pub Date : 2014-10-02 DOI: 10.1145/2661714.2661722
T. Yamasaki, Shumpei Sano, K. Aizawa
{"title":"Social Popularity Score: Predicting Numbers of Views, Comments, and Favorites of Social Photos Using Only Annotations","authors":"T. Yamasaki, Shumpei Sano, K. Aizawa","doi":"10.1145/2661714.2661722","DOIUrl":"https://doi.org/10.1145/2661714.2661722","url":null,"abstract":"In this paper, we propose an algorithm to predict the social popularity (i.e., the numbers of views, comments, and favorites) of content on social networking services using only text annotations. Instead of analyzing image/video content, we estimate social popularity by combining weight vectors obtained from support vector regression (SVR) with tag frequency. Since our proposed algorithm uses text annotations instead of image/video features, its computational cost is small. As a result, we can estimate social popularity more efficiently than previously proposed methods. Furthermore, tags that significantly affect social popularity can be extracted using our algorithm. Our experiments used one million photos from the social networking website Flickr, and the results showed a high correlation between actual social popularity and the popularity estimated by our algorithm. Moreover, the proposed algorithm achieves high accuracy in classifying content as popular or unpopular.","PeriodicalId":365687,"journal":{"name":"WISMM '14","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126868145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
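The abstract above predicts popularity from tags alone using support vector regression. The sketch below is a simplified stand-in for that idea, not the authors' algorithm: it fits a linear model on binary tag-presence features by subgradient descent on the epsilon-insensitive loss (the loss used in SVR) with a small L2 penalty. The toy photos, the scores, the function names, and all hyperparameters are hypothetical. Tag frequency is used here only to choose the vocabulary, because the abstract does not say how frequency and SVR weights are combined.

```python
from collections import Counter


def build_vocab(photos, min_count=1):
    """Keep tags that occur in at least min_count photos (assumed frequency filter)."""
    counts = Counter(tag for tags, _ in photos for tag in set(tags))
    return sorted(t for t, n in counts.items() if n >= min_count)


def train_linear_svr(photos, vocab, eps=0.1, lam=0.001, lr=0.05, epochs=200):
    """Fit per-tag weights and a bias with subgradient descent.

    Objective (per sample): max(0, |y - prediction| - eps) + (lam / 2) * ||w||^2.
    A plain primal solver used for illustration, not a full SVR implementation.
    """
    w = {t: 0.0 for t in vocab}
    b = 0.0
    for _ in range(epochs):
        for tags, y in photos:
            active = sorted(t for t in set(tags) if t in w)
            err = y - (sum(w[t] for t in active) + b)
            # L2 penalty: shrink every weight slightly on each step.
            for t in w:
                w[t] *= 1.0 - lr * lam
            # Epsilon-insensitive loss: update only when the error leaves the tube.
            if abs(err) > eps:
                step = lr if err > 0 else -lr
                for t in active:
                    w[t] += step
                b += step
    return w, b


def predict(w, b, tags):
    """Popularity score for a photo described only by its tags."""
    return sum(w.get(t, 0.0) for t in sorted(set(tags))) + b


# Hypothetical toy data: (tags, popularity score such as a log-scaled view count).
PHOTOS = [
    (["sunset", "beach"], 3.0),
    (["sunset"], 2.0),
    (["blurry"], 0.0),
    (["blurry", "beach"], 1.0),
]

vocab = build_vocab(PHOTOS)
weights, bias = train_linear_svr(PHOTOS, vocab)
print(predict(weights, bias, ["sunset"]), predict(weights, bias, ["blurry"]))
```

Because the features are just tag indicators, prediction costs one lookup per tag, which illustrates why an annotation-only approach can be much cheaper than analyzing image content. In a linear model of this kind, the learned per-tag weights are also the natural place to look for tags that move the score most, though on this tiny dataset the individual weights are not uniquely determined.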